Section I: expanded methods

This document provides supplementary and detailed analysis information not
included in the paper. Other sources of information and the original datasets can
be found on our web site, www.genome.wi.mit.edu/MPR/CNS.

Patient data and tumor bank 

The complete cohort for these studies consists of 68 children with 
medulloblastomas, 10 young adults with malignant gliomas (WHO grades III and 
IV), 5 children with AT/RT, 5 with renal/extrarenal rhabdoid tumors, and 8 children 
with supratentorial PNETs. A summary of the clinical data for the patients can be 
found in the List of all samples section of the document. All patients with 
medulloblastomas were treated with craniospinal irradiation to 2400 - 3600 
centiGray (cGy) with a tumor dose of 5300 - 7200 cGy. All patients with 
medulloblastomas were treated with chemotherapy consisting of cisplatin and 
vincristine, and combinations of carboplatin, etoposide, cyclophosphamide, 
procarbazine or lomustine (CCNU). Two patients received high-dose chemotherapy at
relapse, including methotrexate and thiotepa, followed by autologous bone marrow 
transplantation. Thirty-five of the children with medulloblastomas were part of a 
cohort described in previous publications (Segal et al 1994, Kim et al 1999). All 
tumor samples were obtained at the time of initial surgery prior to treatment. The
samples were snap frozen in liquid nitrogen and stored at -80°C. The studies were
done with approval of the Committee for Clinical Investigation of Boston Children's 
Hospital. The data were organized into three sets: Dataset A (42 samples 
containing: 10 medulloblastomas, 10 malignant gliomas, 5 AT/RT and 5 
renal/extrarenal rhabdoid tumors, 8 supratentorial PNETs and 4 normal cerebella), 
Dataset B (34 samples, containing 9 desmoplastic medulloblastomas and 25
classic medulloblastomas), and Dataset C (60 samples, containing 39
medulloblastoma survivors and 21 treatment failures). There are two additional
variants of Dataset A called A1 and A2 described in the second section of this 
document. A description of each dataset is available in the Datasets and clinical 
attributes section of this document. 

Microarray hybridization 

For a detailed protocol, see http://www.genome.mit.edu/MPR/CNS . Briefly, tissue
samples were homogenized (Polytron, Kinematica, Lucerne) in guanidinium
isothiocyanate and RNA was isolated by centrifugation over a CsCl gradient. RNA
integrity was assessed either by northern blotting (Kim et al 1999) or by gel
electrophoresis. The amount of starting total RNA for each reaction varied
between 10 and 12 µg. First-strand cDNA was generated using a T7-linked
oligo-dT primer, followed by second-strand synthesis. An in vitro
transcription reaction was done to generate cRNA containing biotinylated UTP
and CTP, which was subsequently chemically fragmented at 95°C for 35 minutes.
Ten micrograms of the fragmented, biotinylated cRNA was hybridized in MES 
buffer (2-[N-Morpholino]ethanesulfonic acid) containing 0.5 mg/ml acetylated bovine
serum albumin (Sigma, St. Louis) to Affymetrix (Santa Clara, CA) HuGeneFL 
arrays at 45°C for 16 hours. HuGeneFL arrays contain 5920 known genes and 
897 expressed sequence tags. Arrays were washed and stained with
streptavidin-phycoerythrin (SAPE, Molecular Probes). Signal amplification was
performed using a biotinylated anti-streptavidin antibody (Vector Laboratories,
Burlingame, CA) at 3 µg/ml. This was followed by a second staining with SAPE. Normal
goat IgG (2 mg/ml) was used as a blocking agent. Scans were performed on Affymetrix
scanners and the expression value for each gene was calculated using Affymetrix
GENECHIP software. Minor differences in microarray intensity were corrected
using a linear scaling method as detailed in the next section. 

Preprocessing and re-scaling 

The raw expression data obtained from Affymetrix's GeneChip software were re-scaled
to account for different chip intensities. Each column (sample) in the dataset was
multiplied by 1/slope of a least squares linear fit of the sample vs. the reference
(the first sample in the dataset). This linear fit was done using only genes that had
'Present' calls in both the sample being re-scaled and the reference. The sample
chosen as reference was a typical one (i.e. one with the number of "P" calls closest
to the average over all samples in the dataset). Scans were rejected if the scaling
factor exceeded a factor of 3, fewer than 1000 genes received 'Present' calls, or
microarray artifacts were visible.
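
The re-scaling step can be sketched as follows. This is a minimal illustration
under stated assumptions, not the original implementation: the function names, the
fit with an intercept term, and the toy intensity values are all hypothetical.

```python
def fit_slope(x, y):
    """Slope b of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def rescale_sample(sample, reference, sample_calls, reference_calls):
    """Rescale `sample` onto the intensity scale of `reference`.

    Only genes called Present ("P") in both arrays enter the fit.
    Returns the rescaled values and the scaling factor (1 / slope).
    """
    both_present = [i for i in range(len(sample))
                    if sample_calls[i] == "P" and reference_calls[i] == "P"]
    ref_vals = [reference[i] for i in both_present]
    smp_vals = [sample[i] for i in both_present]
    factor = 1.0 / fit_slope(ref_vals, smp_vals)
    return [v * factor for v in sample], factor


# Toy data (made up): the sample is roughly twice as bright as the reference.
reference = [100.0, 200.0, 400.0, 800.0, 50.0]
sample = [210.0, 405.0, 795.0, 1610.0, 5.0]
ref_calls = ["P", "P", "P", "P", "A"]
smp_calls = ["P", "P", "P", "P", "A"]

rescaled, factor = rescale_sample(sample, reference, smp_calls, ref_calls)
print(round(factor, 3))
```

In this toy case the scaling factor comes out close to 0.5, so the rescaled sample
lies on roughly the same intensity scale as the reference.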

A ceiling of 16,000 units was chosen for all experiments because it is at this level 
that we observe fluorescence saturation of the scanner; values above this cannot 
be reliably measured. For classification problems that are very robust (e.g. 
distinguishing different types of brain tumors), we used a threshold of 100 units 
because there was a sufficiently large number of genes correlated with the 
distinction that the threshold could be set high, thereby minimizing noise, and 
maximizing potential biological interpretation of the marker genes. For the more 
subtle distinctions (e.g. outcome prediction), few correlates of the distinction are 
found, and for this reason the threshold was set at a lower level (20 units) so as to 
avoid missing any potentially informative marker genes. 

These numbers are in Affymetrix's scanner "average difference" units. After this
preprocessing, gene expression values were subjected to a variation filter which
excluded genes showing minimal variation across the samples being analyzed.
The variation filter tests for a fold-change and absolute variation over samples 
(comparing max/min and max-min with predefined values and excluding genes not 
obeying both conditions). The precise parameters of the variation filters for each 
dataset are provided in each analysis section of this document. Different 
thresholds and variation filters were used according to the purpose of the analysis 
(e.g. select weak marker genes for treatment outcome, strong robust marker 
genes for morphology, highly varying genes for PCA etc.). 
For example, if the maximum and minimum values of a gene across samples were
max and min, then the variation filter excluded genes for which max/min < 5 or
max - min < 500. In some cases more or less stringent values were used.
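
The thresholding and variation filter can be sketched as below. This is a sketch
under assumptions: values outside the threshold and ceiling are assumed to be
clipped to those limits, the function and gene names are hypothetical, and the toy
values are made up.

```python
def clip(values, floor=100.0, ceiling=16000.0):
    """Raise values below `floor` to the floor and cap values at `ceiling`."""
    return [min(max(v, floor), ceiling) for v in values]


def passes_variation_filter(values, min_fold=5.0, min_delta=500.0):
    """Keep a gene only if it satisfies both the fold-change and the
    absolute-difference condition across samples."""
    hi, lo = max(values), min(values)
    return hi / lo >= min_fold and hi - lo >= min_delta


# Toy expression rows (one row per gene, one value per sample).
genes = {
    "gene_a": [20.0, 150.0, 900.0, 20000.0],  # strongly varying
    "gene_b": [90.0, 110.0, 130.0, 120.0],    # nearly flat
    "gene_c": [-15.0, 300.0, 450.0, 560.0],   # fold change passes, difference too small
}

kept = {}
for name, row in genes.items():
    clipped = clip(row)
    if passes_variation_filter(clipped):
        kept[name] = clipped

print(list(kept))
```

Because the floor is positive, the ratio max/min is always defined after clipping.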



Clustering. 

Self-Organizing Maps (SOMs). Self-Organizing Maps were computed using our
GeneCluster clustering package, available at www.genome.wi.mit.edu/MPR/Software.
The Self-Organizing Map is a method for performing unsupervised
learning (i.e., learning models for classifying data where the true class of the data
samples is assumed to be unknown prior to model training) where a 2D grid of
nodes (clusters) is iteratively adjusted to reflect the global structure in the
expression dataset (Tamayo et al 1999). In general, unsupervised learning
presents a more difficult problem than supervised learning methods (such as
weighted voting or k-NN) but is useful for discovering new classes during
exploratory analysis. With the SOM, one chooses the geometry of the
grid (e.g., a 3 x 2 grid) and maps it into the k-dimensional feature space. Initially
the nodes are mapped into this space at random, but during training the mapping is
iteratively adjusted to reflect the data structure. The data were first normalized by
standardizing each column (sample) to mean 0 and variance 1. The SOM results
for the clustering of samples can be found in the Multiple tumor clustering section
for multiple tumor samples and in the SOM clustering of treatment outcome samples
section for the clustering of medulloblastomas.
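
As an illustration of the idea, the sketch below standardizes each sample and
trains a small online SOM on a 3 x 2 grid. It is not the GeneCluster
implementation: the Gaussian neighbourhood function, the learning-rate schedule,
the initialisation at random data points and all names and toy values are
assumptions made for this example.

```python
import math
import random


def standardize(column):
    """Scale one sample (column) to mean 0 and variance 1."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / sd for v in column]


def train_som(points, rows=3, cols=2, iterations=2000, seed=0):
    """Online SOM: returns one weight vector per grid node."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumed initialisation: each node starts at a randomly chosen data point.
    nodes = [list(rng.choice(points)) for _ in grid]
    for t in range(iterations):
        progress = t / iterations
        rate = 0.1 * (1.0 - progress)         # learning rate decays towards 0
        width = 1.0 * (1.0 - progress) + 0.1  # neighbourhood width shrinks
        p = rng.choice(points)
        winner = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], p))
        for i, pos in enumerate(grid):
            # Nodes close to the winner on the grid move more than distant ones.
            pull = math.exp(-math.dist(pos, grid[winner]) ** 2 / (2 * width ** 2))
            nodes[i] = [w + rate * pull * (x - w) for w, x in zip(nodes[i], p)]
    return nodes


def assign(points, nodes):
    """Index of the closest node for every point."""
    return [min(range(len(nodes)), key=lambda i: math.dist(nodes[i], p))
            for p in points]


# Toy data: four samples with five gene values each (values are made up).
samples = [
    [1.0, 2.0, 9.0, 8.0, 5.0],
    [1.2, 2.1, 9.5, 7.9, 5.1],
    [9.0, 8.0, 1.0, 2.0, 5.0],
    [9.3, 8.2, 1.1, 1.8, 4.9],
]
points = [standardize(s) for s in samples]
nodes = train_som(points)
labels = assign(points, nodes)
print(labels)
```

Each entry of `labels` is the index of the grid node (cluster) to which the
corresponding sample is assigned after training.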

Hierarchical Clustering is another unsupervised learning method useful for dividing 
data into natural groups. Data is clustered hierarchically by organizing the data 
into a tree structure based upon the degree of similarity between features. We 
used the Cluster and TreeView software (Eisen et al 1998) to perform average 
linkage clustering, which organizes all of the data elements into a single tree with 
the highest levels of the tree representing the discovered classes. The detailed 
clustering results can be accessed in the Multiple tumor clustering section. 



Supervised learning. 

The methodology we followed for building a supervised classifier was as follows:

a) define a target class based on morphology, tumor class or treatment outcome
clinical information;

b) select the "marker" genes with the highest correlation with the target class
using a class separation statistic (signal-to-noise ratio). A permutation test is
also applied to the top ranked genes to assess the statistical significance of
their class correlation;

c) build a classifier in cross-validation (leave-one-out) by removing one sample
and then using the rest as a training set;

d) build several models using different numbers of marker genes and choose as the
final model the one that minimizes the total error in cross-validation;

e) evaluate prediction results, compute confusion matrices and produce
Kaplan-Meier survival plots.
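
Steps (c) and (d) can be sketched as a leave-one-out loop wrapped in a search over
the number of marker genes. This is a hedged sketch: `fit_predict` is a
hypothetical callable standing in for any of the classifiers, and the toy
classifier below simply uses the first `n_genes` features, whereas in a real run
marker selection would be repeated on each training set.

```python
def loocv_errors(samples, labels, n_genes, fit_predict):
    """Count leave-one-out errors for a model that uses `n_genes` markers."""
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if fit_predict(train_x, train_y, samples[i], n_genes) != labels[i]:
            errors += 1
    return errors


def choose_n_genes(samples, labels, candidates, fit_predict):
    """Return the candidate with the fewest cross-validation errors."""
    results = {n: loocv_errors(samples, labels, n, fit_predict)
               for n in candidates}
    best = min(candidates, key=lambda n: results[n])
    return best, results


def nearest_mean(train_x, train_y, test_x, n_genes):
    """Toy classifier: nearest class mean over the first `n_genes` features."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x[:n_genes] for x, y in zip(train_x, train_y) if y == label]
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(mean, test_x[:n_genes]))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy data (made up): feature 0 is noise, feature 1 separates the classes.
samples = [
    [5.0, 1.0, 3.0], [1.0, 1.2, 3.1], [3.0, 0.8, 2.9],
    [4.0, 9.0, 3.0], [2.0, 9.5, 3.2], [6.0, 8.8, 2.8],
]
labels = [0, 0, 0, 1, 1, 1]

best, results = choose_n_genes(samples, labels, [1, 2, 3], nearest_mean)
print(best, results)
```

With this toy data the one-feature model makes errors in cross-validation while
the two-feature model makes none, so two features are selected.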



This methodology was used with the following algorithms: k-nearest neighbors,
weighted voting, support vector machines, SPLASH, metastatic staging, TrkC
gene expression and two combined predictors. The details for each algorithm are
described below.

Gene marker selection 

Genes correlated with a particular class distinction (e.g. class 0 vs. class 1) were
identified by sorting all of the genes on the array according to the signal-to-noise
statistic (Golub et al 1999, Slonim et al 2000)
$(\mu_{\mathrm{class}\,0} - \mu_{\mathrm{class}\,1})/(\sigma_{\mathrm{class}\,0} + \sigma_{\mathrm{class}\,1})$,
where $\mu$ and $\sigma$ represent the mean (or median) and standard deviation of
expression, respectively, for each class. Permutation of the column (sample)
labels was performed to compare these correlations to what would be expected by
chance (see the next section). These marker genes were used to build the
k-nearest neighbor and weighted voting classifiers. SVM and SPLASH use different
methods to select marker genes.
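
A minimal sketch of ranking genes by the signal-to-noise statistic is given
below. The use of the class mean (rather than the median) and of the sample
standard deviation are assumptions, and the gene names and values are made up.

```python
import statistics


def signal_to_noise(values, labels):
    """(mean0 - mean1) / (sd0 + sd1) for one gene across samples."""
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    mu0, mu1 = statistics.mean(class0), statistics.mean(class1)
    sd0, sd1 = statistics.stdev(class0), statistics.stdev(class1)
    return (mu0 - mu1) / (sd0 + sd1)


def rank_markers(expression, labels):
    """Sort gene names from most class-0-high to most class-1-high."""
    scores = {gene: signal_to_noise(values, labels)
              for gene, values in expression.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_up0": [10.0, 12.0, 11.0, 2.0, 3.0, 1.0],
    "gene_flat": [5.0, 6.0, 7.0, 5.0, 6.0, 7.0],
    "gene_up1": [1.0, 2.0, 3.0, 8.0, 9.0, 10.0],
}

order, scores = rank_markers(expression, labels)
print(order)
```

Genes at the top of the list are candidate markers that are high in class 0, and
genes at the bottom are candidate markers that are high in class 1.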

In Section III we describe marker genes for several classifications:

o multiple tumor classes (Multiple tumor class markers),

o classic vs. desmoplastic medulloblastoma morphology (Classic vs.
  desmoplastic MP markers),

o SOM-discovered medulloblastoma classes (SOM-discovered C0 vs. C1
  class gene markers), and

o medulloblastoma treatment outcome (Treatment outcome markers).



Permutation-based neighborhood analysis for marker gene 
selection and screening. 

Before we describe the method in detail we provide some motivation for use of the 
technique and put it in context with other multiple comparison and permutation test 
approaches. 

There are two interrelated problems that we have addressed with our permutation- 
based neighborhood analysis first introduced in Golub et al 1999 and Slonim et al 
2000. One is the problem of feature (gene) selection in terms of how many and 
which genes to input to a supervised learning classifier. This process is a necessary 
step in a supervised learning methodology as many classifier algorithms cannot deal 
with thousands of input variables and require some type of dimensionality reduction 
or prior selection. The other problem is to choose statistically significant molecular
markers or differentially expressed genes that deserve more detailed biological
study, for example those that one may choose for further validation using a
different technology or experimental technique (e.g. RT-PCR, immunochemistry,
etc.).

It is important to point out that in these two problems one is basically interested in 
selecting the subset of genes more likely to be useful in discriminating the 
phenotype of interest either as single markers or in combination with others. In other 
words we are interested in a ranking and screening process that identifies enough of 
the relevant features. One can easily tolerate some amount of false positive errors in 
exchange for higher sensitivity. Most of the current molecular classification problems 
of interest, such as morphological or lineage distinction, treatment outcome 
prediction, drug resistance etc. fit this scenario. These problems involve the 
presence of a potentially weak signal, for example a few marker genes in a 
background of technical variation and noise, and therefore favor marker selection 
methods that are very sensitive and have enough statistical power to produce non- 
empty results. One can tolerate some number of false positive errors because the 
selected marker genes are usually weighted or further selected by the classifier. This 
is also the case in determining biological significance/relevance; in any serious 
follow up study the markers would have to be further validated. This need for higher 
sensitivity and adaptability to the dataset being analyzed is one of the main 
motivations behind our approach as originally applied to Leukemia subtype 
distinction in Golub et al 1999 and in its current form in this paper. 

Marker gene selection, with the characteristics described in the previous 
paragraph, can be seen as an example of a Multiple Comparison Procedure 
(MCP) where multiple hypotheses (genes) are tested simultaneously and then 
accepted or rejected according to a testing procedure. For recent reviews on MCP 
see Hochberg and Tamhane 1997, Bender and Lange 2001, and the special issue
on Multiple Comparisons of the Journal of Statistical Planning and Inference (vol. 
82, 1999). In our case, each statistical test of a gene can be seen as testing a null 
hypothesis on the equivalence of the phenotype classes. In this way rejecting a 
subset of the hypotheses corresponds to selecting a set of statistically significant 
differentially expressed genes at a given significance level. A global null 
hypothesis would assert that no gene expression changes are significant and 
therefore that there is no significant difference between the biological classes as 
measured by the entire set of microarrays. Notice that this situation of all null
hypotheses being true is not likely to be realistic because in practice most
microarray experiments are done with biological classes with known differences
that are usually reflected in multiple genes. A more realistic situation is one in
which a subset of the hypotheses is false; this corresponds to the usual problem of
selecting between a few dozen and a few hundred genes. Traditional approaches
to the MCP assume that all, or almost all, of the null hypotheses are true. They
also control for the Family Wise Error Rate (FWER), i.e., the probability that
at least one type I (false positive) error occurs. In the marker
selection problem this would be the case where there was even one wrong marker
gene in the determined marker set. However, the models we construct are not
really sensitive to a small number of false positives in the selected marker set.
Thus, controlling the FWER is an overly conservative approach that does not
provide enough statistical power for the purposes of marker selection and may
actually yield no candidate marker genes. This situation of partial rejection is
actually quite common in exploratory data analysis and in recent years alternative
less conservative and more sensitive formulations of the MCP have been
introduced. These methods control the False Discovery Rate (FDR) rather than
the FWER (Benjamini and Hochberg 1995). The FDR is the expected proportion of
type I, or false positive, errors among the hypotheses rejected by the MCP.
Controlling for this quantity
moves the MCP closer to the type of approaches used in machine learning feature
selection and leads to methods with higher statistical power. Statistically the FDR
is a compromise between an ultra conservative correction a la Bonferroni and
making no correction at all (Benjamini et al 2001). This type of approach is clearly
more appropriate for gene selection.
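
To make the contrast concrete, the sketch below compares a Bonferroni correction
(which controls the FWER) with the step-up procedure of Benjamini and Hochberg
(which controls the FDR) on a made-up list of p-values. It is shown only as an
illustration of the two kinds of control and is not the procedure used in this
study; the function names and p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject hypothesis i if p_i <= alpha / m (controls the FWER)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up procedure: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= (k / m) * alpha (controls the FDR)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


# Made-up p-values for ten genes.
p_values = [0.001, 0.008, 0.012, 0.03, 0.2, 0.5, 0.7, 0.9, 0.04, 0.6]

n_bonferroni = sum(bonferroni(p_values))
n_bh = sum(benjamini_hochberg(p_values))
print(n_bonferroni, n_bh)
```

On these values the Bonferroni correction selects one gene while the FDR-based
procedure selects three, which illustrates the gain in sensitivity.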

Regardless of the assumption on the number of true hypotheses, or the emphasis 
on FWER or FDR, the real problem in multiple comparisons is that the hypotheses 
(genes) are correlated in complex ways reflecting the structure of genetic pathways 
and interactions. This makes the dependence structure of the data quite difficult to 
analyze or capture in a test. Traditional corrections, such as Bonferroni, are too 
conservative and produce essentially no marker genes except for cases where the 
differences are overwhelming (e.g. dead tissue vs. live tissue). Less conservative 
approaches attempt to solve this problem by using closed testing step-wise methods
where the hypotheses can be tested in a specific order and decisions made in a
step-wise manner. Decisions on earlier hypotheses may affect later decisions and
in this way the dependence structure can be taken into account (see for example
Tamhane 1996, Tamhane and Dunnett 1999, Somerville 1999, Troendle 2000). On 
a parallel track, MCP methods to increase the statistical power by resampling 
(bootstrap) have also been introduced (Westfall and Young 1993). These methods 
control the FWER but resample the empirical null distribution to provide less 
conservative corrections for the p-values. Some of these methods are included in 
the PROC MULTTEST procedure in SAS (Westfall and Wolfinger 1999). 

The comparison of FWER vs. FDR, step-wise vs. single step, resampling vs.
analytical p-value adjustments, and in general the assessment of the virtues or
applicability of different MCP methods, continues today. It has generated a healthy
debate in the statistics community (see for example Benjamini et al 1999, Bender
and Lange 2001). Another perspective on the MCP can be obtained by Bayesian
methods (see Berry and Hochberg 1999 for a review).
From the perspective of machine learning and pattern recognition, the problem of
optimal feature selection is intractable and one has to be content with empirical
approximations that may have to be tailored to fit the application (Duda, Hart and
Stork 2001). Two common approaches are based on the use of filters and wrappers
(Kohavi and John 1998). Filter approaches select the best features using a score
function that measures the discrimination power of the feature with respect to
the target in a way similar to how the test statistic is used in MCP. Typical score
functions are, for example, the mutual information, signal to noise ratios, Naive Bayes
posteriors, inner products, linear transformation (e.g. eigenvectors of the covariance
matrix), or bounds on the Bayes error such as the Bhattacharyya distance. Wrapper
methods involve the use of the actual classifier in the selection process and can be
seen as non-linear optimization problems. For more details see Kohavi and John
1998, Cherkassky and Mulier 1998, Fukunaga 1990, Kearns and Vazirani 1997 and
Duda, Hart and Stork 2001.

Our permutation based neighborhood analysis method is a direct attempt to solve 
the multiple hypothesis problem by comparing the actual distribution of markers (i.e. 
neighbors of an ideal marker separating the classes) with a reference empirical 
distribution obtained by permuting the phenotype class labels. It is based on a 
standard global permutation test (Fisher 1935, Lehmann 1986, Good 1994) of the
phenotype levels keeping the gene correlation information. A histogram of scores for 
each of the marker genes of each permutation (neighborhood) is kept and the 
significance of an actual gene marker is obtained by finding the appropriate 
percentile in the histogram of the correspondingly ranked marker (i.e. the one with 
the same rank, e.g. best match, second best match etc.). This empirical distribution- 
free method is simple, intuitive and adapts itself to the correlation structure of the 
data providing higher statistical power. It minimizes the total number of false 
positives and uses the empirical reference distribution in a similar way as
FDR-based and resampling methods do. Recently, general MCP methods have been
proposed to combine both resampling and control of the FDR (Yekutieli and
Benjamini 1999).

The application of permutation tests has also been introduced in the structural 
analysis of genetic linkage and detection of QTL (Quantitative Trait Loci). In
these methods (Churchill and Doerge 1994, Doerge and Churchill 1996) the traits
are randomly permuted to create data sets that have random genotype-phenotype
association. Those methods and ours are conceptually quite similar although in our
case we consider "functional" expression data rather than genotype data.

After we introduced our method in Golub et al 1999 other methods have been 
introduced in the literature. For example the SAM method of Tusher et al 2001 is 
similar to ours but includes a user-adjustable threshold to provide estimates of the 
FDR. Dudoit et al 2001 have introduced a method based on step-down adjusted p- 
values using Westfall and Young's approach in the context of replicated cDNA 
experiments. Ideker et al 2000 used generalized likelihood tests to assess the 
statistical significance of differentially expressed genes in the context of two channel 
cDNA microarrays. Newton et al 2001 and Baldi and Long 2001 use empirical Bayes 
hierarchical models to assess significance of differential expression. Lee et al 2000 
combine the data from replicates to estimate posterior probabilities and identify 
differentially expressed genes. No systematic comparison of the error rates and
statistical power of all these different methods has been published yet. It will be
interesting to develop a better understanding of the different trade-offs between
sensitivity and specificity, number of false positives vs. statistical power to guide the
development of future analysis methodologies.

Description of the permutation test-based neighborhood analysis method. 

Permutation test-based neighborhood analysis (Golub et al 1999) is used to
select and screen marker genes with respect to biologically meaningful
phenotypes (morphology and treatment outcome) and to assess their statistical
significance. To accomplish this we compare the signal-to-noise scores of the top
marker genes with the corresponding ones from data obtained by randomly
permuting the class labels. Typically 500 global random permutations were used
to build histograms. Based on these histograms we determined the 50% (median),
5% and 1% significance levels and compared them with the values obtained for
the real dataset. As described above this procedure is motivated by considering
the following question: what is the likelihood that a given set of marker genes
for a phenotype of interest, for example selected by signal to noise, represents
chance correlations and not biologically significant matches? If one looks down the list of
markers, how many should one consider as input to a classifier or for further 
study? In this list of selected markers what is the best way to minimize the 
number of false positives but retain enough sensitivity to select a non-empty set? 

In detail the permutation test procedure for a given comparison of interest (e.g.
markers high in class 0 and low in class 1) is as follows:

• Generate signal-to-noise scores
$(\mu_{\mathrm{class}\,0} - \mu_{\mathrm{class}\,1})/(\sigma_{\mathrm{class}\,0} + \sigma_{\mathrm{class}\,1})$
for all genes that pass a variation filter using the actual class labels (phenotype)
and sort them accordingly. The best match (k=1) is the gene "closest" to, or most
correlated with, the phenotype using the signal to noise as a correlation function.
In fact one can imagine the reciprocal of the signal to noise as a "distance"
between the "phenotype" and each gene as shown in the figure below.
One can also use a t-statistic
$(\mu_{\mathrm{class}\,0} - \mu_{\mathrm{class}\,1})/\sqrt{\sigma_{\mathrm{class}\,0}^2 + \sigma_{\mathrm{class}\,1}^2}$
and obtain very similar results.

• Generate 500 or more random permutations of the class labels (phenotype).
For each case of randomized class labels generate signal-to-noise scores and
sort genes accordingly.

• Build a histogram of signal to noise scores for each value of k. For example
one for all the 500 top markers (k=1), another one for the 500 second best
(k=2) etc. These histograms represent a reference statistic for the best match,
second best, etc. and, for a given value of k, different genes contribute to it.
Notice that the correlation structure of the data is preserved by this procedure.
For each value of k, determine different percentiles (1%, 5%, 50% etc.) of the
corresponding histogram. (See the bottom diagrams in the figure.)

• Compare the actual signal to noise scores with the different significance levels
obtained for the histograms of permuted class labels for each value of k. This
test helps to assess the statistical significance of gene markers in terms of the
distribution of class-gene scores using permuted labels.
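
The four steps above can be sketched as follows. This is a minimal illustration
rather than the original software: the function names, the use of the sample
standard deviation, the way percentiles are read off the sorted permutation
scores, and the toy data (20 noise genes plus one planted marker) are assumptions.

```python
import math
import random
import statistics


def snr(values, labels):
    """Signal-to-noise score of one gene for a given labelling."""
    def mean_sd(xs):
        m = sum(xs) / len(xs)
        return m, math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))
    m0, s0 = mean_sd([v for v, y in zip(values, labels) if y == 0])
    m1, s1 = mean_sd([v for v, y in zip(values, labels) if y == 1])
    return (m0 - m1) / (s0 + s1)


def sorted_scores(genes, labels):
    """Scores of all genes, best (highest in class 0) first."""
    return sorted((snr(g, labels) for g in genes), reverse=True)


def neighborhood_analysis(genes, labels, n_perm=500, top_k=3, seed=0):
    rng = random.Random(seed)
    observed = sorted_scores(genes, labels)[:top_k]
    # histograms[k] collects the (k+1)-th best score of every permutation.
    histograms = [[] for _ in range(top_k)]
    for _ in range(n_perm):
        permuted = labels[:]
        rng.shuffle(permuted)  # one permutation is applied to all genes
        for k, score in enumerate(sorted_scores(genes, permuted)[:top_k]):
            histograms[k].append(score)
    report = []
    for k in range(top_k):
        ranked = sorted(histograms[k], reverse=True)
        report.append({
            "rank": k + 1,
            "observed": observed[k],
            "perm_1pct": ranked[int(0.01 * n_perm)],
            "perm_5pct": ranked[int(0.05 * n_perm)],
            "median": statistics.median(ranked),
        })
    return report


# Toy data: 12 samples (6 per class), 20 noise genes and one planted marker.
data_rng = random.Random(1)
labels = [0] * 6 + [1] * 6
genes = [[data_rng.gauss(5.0, 1.0) for _ in labels] for _ in range(20)]
genes.append([10.0, 11.0, 9.5, 10.5, 9.0, 11.5, 1.0, 2.0, 1.5, 0.5, 2.5, 1.2])

report = neighborhood_analysis(genes, labels)
for row in report:
    print(row)
```

Because the same permuted labels are applied to every gene, the gene-to-gene
correlation structure is preserved in each permuted dataset. A gene whose
observed score exceeds the 1% or 5% level for its rank would be reported as
significant at that level.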



In the results section the values for permutation tests of marker genes are 
reported in tables with this format: 



Distinction  Distance    Perm 1%     Perm 5%     Median 50%  Feature           Desc
class 0      0.96694607  1.0144908   0.8333578   0.6280173   M93119_at         INSM1 Insulinoma-associated 1
class 0      0.9096911   0.8600172   0.7669801   0.5740431   M30448_s_at       Casein kinase II beta subunit
class 0      0.90010124  0.85051423  0.7251496   0.5494933   S82240_at         RhoE
class 0      0.832689    0.84354156  0.7071885   0.5292253   U44060_at         Homeodomain protein (Prox 1)
class 0      0.83225346  0.8009565   0.68034023  0.5169537   D80004_at         KIAA0182 gene
class 1      1.6520017   0.9831643   0.84544426  0.6230137   X86693_at         High endothelial venule
class 1      1.2436218   0.88150144  0.7559189   0.5795857   M93426_at         PTPRZ Protein tyrosine phosphatase
class 1      1.2317128   0.86047184  0.70928395  0.5539352   U48705_rna1_s_at  Receptor tyrosine kinase DDR gene
class 1      1.2259983   0.8433512   0.68909335  0.5358038   X86809_at         Major astrocytic phosphoprotein PEA-15
class 1      1.214929    0.8281318   0.6849929   0.5217813   U45955_at         Neuronal membrane glycoprotein M6b
class 1      1.2095517   0.79365546  0.6711517   0.510208    U53204_at         Plectin (PLEC1)

The Distinction column gives the class for which the markers are high (low in
the other classes). The Distance column is the signal-to-noise score with respect to
the actual phenotype. The Perm 1%, Perm 5% and Median 50% columns give the
percentiles (significance levels) in the histograms of signal to noise scores for
permuted labels for a given value of k. The Feature column is the gene accession
number and the Desc column is the gene name. Permutation test results are reported
in the gene markers sections: Multiple tumor class markers, Classic vs. desmoplastic
MP markers, SOM-discovered C0 vs. C1 class gene markers, and Treatment
outcome markers.

[Figure: Neighborhood Analysis: Assessing Statistical Significance of Gene-Class
Correlations. Panels: an ideal marker and an actual marker for classes A and B; the
actual class label neighborhood and a permuted class label neighborhood; and, at the
bottom, for k = 1, the density distribution (Freq.) of the measure of correlation
(signal to noise) for the permuted pattern, with the 5%, median and 95% levels, the
observed value and the significant neighbors indicated.]

Additional Notes:

• This test helps to assess the significance of gene markers in terms 
of class-gene correlations but if a group of genes fails to pass the 
test that by itself does not necessarily imply that they cannot be used 
to build an effective classifier (Huberty 1994, Kearns and Vazirani 
1997). For example, in contrast with the case of morphological 
distinctions, for treatment outcome prediction the top marker genes 
do not show overwhelming statistical significance (they are "weak" 
markers) and yet they are effective when used in combination as 
input to a classifier. 

• The choice of the signal to noise distance is somewhat ad hoc but
not unreasonable. The reason the signal to noise ratio was chosen
instead of a t-statistic or other class distance measure was mainly
historical and empirical: it performed slightly better in a previous
study of gene expression (Leukemia) feature selection combined
with a weighted voting classifier.

• In terms of feature selection our approach can be considered a filter
method based on signal to noise ratios but it is important to keep in
mind that when the genes selected by this method are fed to a
supervised classifier there is an additional selection of the number of
genes based on error rates (see the Algorithms and Permutation
test for outcome predictor sections of this document).

• The advantages of performing a permutation test are multiple: 

o It is a distribution-free, direct empirical method to test the 
significance of the matching of a given phenotype to a 
particular set of genes (dataset). 

o It does not assume a particular functional form for the 
distribution or correlation structure of genes. 

o As the permutation test is done on the entire distribution of 
genes (as scored by signal to noise from the phenotype) the 
gene-to-gene correlation structure is taken into account. 

• Another more geometrical, and sometimes more intuitive, way to 
look at this procedure is to consider the figure above as a 
hypothetical projection of normalized gene expression space where 
each dimension represents an experiment and each data point a 
gene. The entire dataset of filtered genes will be represented by a 
collection of data points distributed in that space. Each gene is 
represented by a point and the closer two points are, the more 
correlated they are (i.e. across the set of experiments being 
considered). Now imagine projecting a point that corresponds to an 
ideal marker gene that perfectly represents the phenotype of 
interest. This is for example a marker gene that is high in one of the 
classes and low in the other. This gene will be a perfect classifier to 
distinguish the two classes. We are interested in finding marker 
genes that are, if not equal, at least similar to this ideal marker. This 
can be accomplished by computing a distance or correlation 
measure between the class labels (phenotype) and the genes. In this 
sense we are looking at the "neighborhood" of a phenotype in gene 
expression space trying to find "close" neighbors. A permutation test 
in this context is equivalent to moving the ideal gene marker point at
random (as the labels are permuted) and obtaining a distribution of
neighbors each time it lands on a new reference point (random
phenotype) in expression space. By building a histogram of distance 
distributions to these random reference locations one can assess 
how "typical" the actual neighborhood of the actual phenotype is 
compared to random phenotypes. For example, if only once in a 
thousand random tries we found a set of top 10 markers as 
correlated as in the actual neighborhood, then we would consider 
those markers to be significant. In this interpretation, the permutation 
test resembles a spatial correlation Mantel test in which one 
measures the significance of finding excess "density" of neighbors 
(genes) around a point (ideal marker) that represents the phenotype 
of interest when compared with the density at random phenotype 
classes. 



Permutation Test for Outcome Predictor 

There is an additional permutation test (Fisher 1935, Lehman 1986, Good 1994) 
that was developed to assess the statistical significance of the k-nearest neighbor 
predictor algorithm. In this test the phenotype (treatment outcome) labels are 
randomly permuted 1000 times, and for each permutation a set of models is built 
using the same set of parameters (e.g. k = number of neighbors, ng = number of 
features/genes) as the ones used in finding the actual model. For each of these 
1000 random predictors one then selects the best error rate among the results 
for that set of parameters, and builds a histogram of these best error rates. 
The significance of the predictor is assessed by the area of the histogram 
corresponding to random predictors with error rates better than that of the 
actual predictor (see figure below). The results of this procedure for the 
k-nearest neighbor predictor of treatment outcome are reported in the 
Permutation test for k-nearest neighbor outcome predictor section. 
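
The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the paper: the model builder is a toy unweighted k-NN on a single feature, the parameter grid (`ks`) and the function names are hypothetical, and ties with the actual error count are counted as "at least as good", which is a conservative choice the text does not specify.

```python
import random

def loocv_errors(values, labels, k):
    """Leave-one-out error count of a k-NN majority vote on 1-D values."""
    errors = 0
    for i in range(len(values)):
        others = [(abs(values[i] - values[j]), labels[j])
                  for j in range(len(values)) if j != i]
        others.sort(key=lambda pair: pair[0])
        votes = [label for _, label in others[:k]]
        predicted = 1 if 2 * sum(votes) > len(votes) else 0
        errors += predicted != labels[i]
    return errors

def best_errors(values, labels, ks):
    """Best (lowest) error count over the parameter grid."""
    return min(loocv_errors(values, labels, k) for k in ks)

def permutation_test(values, labels, ks=(1, 3), n_perm=1000, seed=0):
    """Compare the actual predictor with predictors built on permuted labels."""
    rng = random.Random(seed)
    actual = best_errors(values, labels, ks)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # permute the phenotype labels
        null.append(best_errors(values, shuffled, ks))
    # Fraction of random predictors doing at least as well as the actual one.
    p_value = sum(e <= actual for e in null) / n_perm
    return actual, null, p_value

if __name__ == "__main__":
    toy_values = [0.1, 0.2, 0.3, 0.4, 0.5, 1.1, 1.2, 1.3, 1.4, 1.5]
    toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    actual, null, p = permutation_test(toy_values, toy_labels)
    print("actual errors:", actual, "permutation p-value:", p)
```

The list `null` holds the values that would go into the histogram; the p-value is the fraction of that histogram lying at or below the actual error count.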



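[Figure: Permutation Test for Predictor. For each setting of the model 
parameters, models are built on the actual labels and on 1000 sets of randomly 
permuted labels, and the number of errors is recorded, as in the example table 
shown in the figure:] 

Number of   Number of   Actual model   Random model 1   Random model 2   ...   Random model 1000 
neighbors   genes       errors         errors           errors                 errors 
3           1           20             23               19                     25 
3           2           14             21               34                     26 
5           1           18             23               27                     31 
5           2           16             [illegible]      24                     28 

[The best actual model is selected, and the best error rate of each random 
model is selected for the histogram. The histogram shows the density 
distribution of errors for predictors built on randomly permuted labels 
(x-axis: Number of Errors, marked at the 5%, median and 95% points; y-axis: 
Freq.). Significance is assessed by comparing the actual model with the 
histogram.] 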

Algorithms 

k-Nearest Neighbors (k-NN) 

We developed a weighted implementation of the k-NN algorithm (Dasarathy 1991) 
that predicts the class of a new sample by calculating the Euclidean distance (d) of 
this sample to the k "nearest neighbor" standardized samples in "expression" 
space in the training set, and by selecting the predicted class to be that of the 
majority of the k samples (the method is defined in terms of Euclidean distances 
over standardized vectors, so it is equivalent to using inner products: a . b / |a||b|). 
We performed a marker gene selection process by which we feed the k-NN 
algorithm only the features with higher correlation with the target class. This 
feature selection is done by sorting the features according to the signal-to-noise 
statistic (Golub 1999, Slonim 2000) 
$S_x = (\mu_{class0} - \mu_{class1})/(\sigma_{class0} + \sigma_{class1})$. In our 
version of the algorithm the vote of each of the k neighbors was weighted 
according to 1/d. For our medulloblastoma outcome experiments, the k-NN 
models were evaluated by 60-fold leave-one-out cross-validation, whereby a 
training set of 59 samples was used to predict the class of a withheld 
sample. This was repeated for all samples and the cumulative error rate was 
recorded. Models with variable numbers of genes (1-200, selected according to 
their correlation with the survivor vs. treatment failure distinction in the training set) 
were tested in this manner. The detailed results of applying this algorithm to the 
different datasets can be found in the Multiple tumor classes predictions (k-NN), 
Classic vs. desmoplastic MD and k-nearest neighbors treatment outcome 
prediction results sections. 
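
As a rough illustration of the steps just described (signal-to-noise gene selection inside each training fold, 1/d-weighted voting among the k nearest neighbors, and leave-one-out cross-validation), the following Python sketch may help. It is not the authors' implementation: the function names are hypothetical, standardization of the samples is omitted for brevity, and the population standard deviation is assumed for the signal-to-noise statistic because the text does not say which estimator was used.

```python
import math
from statistics import mean, pstdev

def signal_to_noise(column, labels):
    """(mean0 - mean1) / (sd0 + sd1) for one gene across the training samples."""
    c0 = [v for v, y in zip(column, labels) if y == 0]
    c1 = [v for v, y in zip(column, labels) if y == 1]
    denom = pstdev(c0) + pstdev(c1)   # assumption: population standard deviation
    return (mean(c0) - mean(c1)) / denom if denom > 0 else 0.0

def select_genes(X, labels, n_genes):
    """Indices of the n_genes features with the largest |signal-to-noise|."""
    scores = [abs(signal_to_noise([row[g] for row in X], labels))
              for g in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda g: -scores[g])[:n_genes]

def weighted_knn_predict(train_X, train_y, sample, k):
    """Each of the k nearest neighbors votes for its class with weight 1/d."""
    neighbors = sorted((math.dist(row, sample), y)
                       for row, y in zip(train_X, train_y))[:k]
    votes = {0: 0.0, 1: 0.0}
    for d, y in neighbors:
        votes[y] += 1.0 / max(d, 1e-12)   # guard against zero distance
    return 0 if votes[0] >= votes[1] else 1

def loocv_errors(X, labels, k, n_genes):
    """Leave-one-out errors; genes are re-selected in every training fold."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        genes = select_genes(train_X, train_y, n_genes)
        reduced = [[row[g] for g in genes] for row in train_X]
        held_out = [X[i][g] for g in genes]
        errors += weighted_knn_predict(reduced, train_y, held_out, k) != labels[i]
    return errors

if __name__ == "__main__":
    # Toy data: 6 samples x 3 genes; only the first gene separates the classes.
    X = [[1.0, 5.0, 3.0], [1.2, 4.0, 3.1], [0.9, 6.0, 2.9],
         [3.0, 5.5, 3.0], [3.1, 4.5, 3.2], [2.9, 5.0, 2.8]]
    y = [0, 0, 0, 1, 1, 1]
    print("selected genes:", select_genes(X, y, 1))
    print("leave-one-out errors:", loocv_errors(X, y, k=3, n_genes=1))
```

Selecting the genes inside each fold, rather than once on all samples, keeps the withheld sample from influencing the features used to classify it.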

Weighted Voting. 

The weighted voting algorithm (Golub 1999, Slonim 2000) makes a weighted 
linear combination of relevant "marker" or "informative" genes obtained in the 
training set to provide a classification scheme for new samples. The selection of 
features (marker genes) is accomplished by computing the signal-to-noise statistic 
$S_x$ (described above). The class predictor is uniquely defined by the initial set of 
samples and marker genes. In addition to computing $S_x$, the algorithm also finds 
the decision boundaries (halfway) between the class means: 
$b_x = (\mu_{class0} + \mu_{class1})/2$ for each gene. To predict the class of a test 
sample y, each gene x in the feature set casts a vote: $V_x = S_x (g_x^y - b_x)$, and 
the final vote for class 0 or 1 is $\mathrm{sign}(\sum_x V_x)$. The strength or 
confidence in the prediction of the winning class is 
$(V_{win} - V_{lose})/(V_{win} + V_{lose})$ (i.e., the relative margin of victory for the 
vote). The detailed prediction results are in the Weighted voting treatment 
outcome prediction results section. 
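
A minimal Python sketch of the voting scheme follows. It is an illustration rather than the original code: the function names are hypothetical, the population standard deviation is assumed in the signal-to-noise statistic, and the mapping of a positive total vote to class 0 follows from defining the statistic as (class 0 mean minus class 1 mean).

```python
from statistics import mean, pstdev

def train_weighted_voting(X, labels):
    """Per-gene (S_x, b_x): signal-to-noise weight and midpoint boundary."""
    params = []
    for g in range(len(X[0])):
        c0 = [row[g] for row, y in zip(X, labels) if y == 0]
        c1 = [row[g] for row, y in zip(X, labels) if y == 1]
        denom = pstdev(c0) + pstdev(c1)
        s = (mean(c0) - mean(c1)) / denom if denom > 0 else 0.0
        b = (mean(c0) + mean(c1)) / 2
        params.append((s, b))
    return params

def predict(params, sample):
    """Return (winning class, prediction strength) for one test sample."""
    votes = [s * (value - b) for (s, b), value in zip(params, sample)]
    v0 = sum(v for v in votes if v > 0)     # total vote for class 0
    v1 = -sum(v for v in votes if v < 0)    # total vote for class 1
    winner = 0 if sum(votes) >= 0 else 1
    v_win, v_lose = (v0, v1) if winner == 0 else (v1, v0)
    total = v_win + v_lose
    strength = (v_win - v_lose) / total if total > 0 else 0.0
    return winner, strength

if __name__ == "__main__":
    # Toy training set: gene 0 is high in class 1, gene 1 is high in class 0.
    X = [[1, 10], [2, 11], [9, 2], [10, 3]]
    y = [0, 0, 1, 1]
    params = train_weighted_voting(X, y)
    print(predict(params, [1.5, 10.5]))
    print(predict(params, [9, 9]))
```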

Support Vector Machines. 

The Support Vector Machine (SVM) for classification minimizes the generalization 
error rather than the training error. The basic idea behind SVMs is to construct an 
optimal separating hyperplane by mapping the gene expression data to a high- 
dimensional space (Mukherjee et al 1999, Brown et al 2000). Linear separation in 
this higher dimensional space corresponds to a nonlinear decision boundary in the 
original space. A new feature selection algorithm was developed to scale the input 
features to minimize the ratio of the radius around the support vectors and the 
margin. The detailed results are in the SVM treatment outcome prediction results 
section. 

SPLASH. 

The Splash algorithm (Califano et al 1999) discovers efficiently and 
deterministically all statistically significant gene expression patterns in a target 
class of interest. Statistical significance is evaluated based on the probability of a 
"pattern" (i.e. a subset of genes and experiments within a narrow interval of 
expression values) occurring by chance in the control target class. A greedy set 
covering algorithm is used to select an optimal subset of statistically significant 
patterns. These patterns are accumulated and form the basis for a likelihood ratio 
classification scheme to predict new samples. The detailed results are in the 
SPLASH treatment outcome prediction results section. 

Predictors using metastatic staging and TrkC. 

These classifiers were constructed by finding the decision boundary halfway 
between the classes: $(\mu_{class0} + \mu_{class1})/2$ (using the staging values 0 vs. 
1,2,3,4 or the continuous TrkC gene expression) and then predicting the unknown 
sample according to the location of its value with respect to that boundary. The 
detailed results can be found in the TrkC treatment outcome prediction results and 
Staging treatment outcome prediction results sections. 
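
The single-variable predictor described above amounts to a midpoint threshold. The sketch below shows one way to write it; the function names and the toy staging values are hypothetical, and the handling of a value that falls exactly on the boundary (assigned to the lower side) is an assumption the text does not address.

```python
from statistics import mean

def fit_midpoint(values, labels):
    """Boundary halfway between the class means, plus which side is class 1."""
    mean0 = mean(v for v, y in zip(values, labels) if y == 0)
    mean1 = mean(v for v, y in zip(values, labels) if y == 1)
    boundary = (mean0 + mean1) / 2
    high_is_class1 = mean1 >= mean0
    return boundary, high_is_class1

def predict(boundary, high_is_class1, value):
    """Assign the class whose mean lies on the same side of the boundary."""
    above = value > boundary
    return 1 if above == high_is_class1 else 0

if __name__ == "__main__":
    # Toy example: metastatic stage (0-4) for class 0 and class 1 patients.
    stages = [0, 0, 0, 1, 0, 3, 1, 0, 3, 2]
    outcome = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    boundary, high_is_class1 = fit_midpoint(stages, outcome)
    print("boundary:", boundary)
    print("stage 3 ->", predict(boundary, high_is_class1, 3))
```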

Proportional chance criterion. 

In order to compute p-values for non-survival predictions, for example the p-value 
of $4 \times 10^{-7}$ for the Classic vs. Desmoplastic classifier reported in the paper 
(33 out of 34 samples correctly classified), we used a "proportional chance 
criterion" to evaluate the probability that a random predictor will produce a 
confusion matrix with the same row and column counts as the gene expression 
predictor. For example, for a binary class (A vs. B) problem, if $\alpha$ is the prior 
probability of a sample being in class A and $p$ is the true proportion of samples in 
class A, then $C_p = p\alpha + (1-p)(1-\alpha)$ is the proportion of the overall sample 
that is expected to receive correct classification by chance alone. Then, if 
$C_{model}$ is the proportion of correct classifications achieved by the gene 
expression predictor, one can estimate its significance by using a Z statistic of the 
form $(C_{model} - C_p)/\sqrt{C_p(1-C_p)/n}$, where n is the total sample count. For 
more details see chapter VII of Huberty 1994. 
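
The Z statistic above is simple to compute. The following Python sketch is a worked illustration only: the function name is hypothetical, the one-sided normal tail is an assumed way of turning Z into a p-value, and the example sets the prior equal to the observed class proportion (25 classic out of 34), which the text does not specify. It is therefore not expected to reproduce the p-value reported in the paper.

```python
import math

def proportional_chance(n_correct, n_total, p, alpha):
    """Chance-level accuracy C_p, Z statistic and one-sided normal p-value."""
    c_p = p * alpha + (1 - p) * (1 - alpha)
    c_model = n_correct / n_total
    z = (c_model - c_p) / math.sqrt(c_p * (1 - c_p) / n_total)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))   # upper tail of N(0, 1)
    return c_p, z, p_value

if __name__ == "__main__":
    p_classic = 25 / 34          # proportion of classic samples in Dataset B
    c_p, z, p_value = proportional_chance(33, 34, p_classic, alpha=p_classic)
    print("C_p = %.4f, Z = %.2f, one-sided p = %.2g" % (c_p, z, p_value))
```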

Survival analysis and Kaplan-Meier plots 

The Kaplan-Meier survival analysis plots are computed using the S-Plus 
(http://www.insightful.com/products/splus/) statistical software package: S-Plus 
2000, Guide to Statistics Volume 2, chapter 9. The p-values for the prediction of 
outcome groups are computed using a log-rank test (Mantel-Haenszel method, 
chapter 9 in the same reference). The Kaplan-Meier plots and associated log-rank 
test p-values are included at the end of each of the outcome prediction sections, 
starting in the k-nearest neighbors treatment outcome prediction results section. 
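
For readers unfamiliar with the Kaplan-Meier (product-limit) estimator, a minimal Python sketch is given here. The analysis in this document was done in S-Plus; this code is only an illustration with a hypothetical function name and toy follow-up data, and it does not implement the log-rank test.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if death was observed at that time, 0 if censored (alive)
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        removed = sum(1 for tt, e in data if tt == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed        # deaths and censored cases leave the risk set
        i += removed
    return curve

if __name__ == "__main__":
    toy_times = [5, 7, 7, 9, 10, 11, 24]
    toy_events = [1, 1, 0, 1, 0, 1, 0]
    for t, s in kaplan_meier(toy_times, toy_events):
        print("t = %d months, S(t) = %.3f" % (t, s))
```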



PCA and multidimensional-scaling of Brain tumor samples 

Datasets of large dimensionality (i.e. with a large number of variables, e.g. genes) 
are in general difficult to visualize due to the intrinsic difficulty of reducing and 
projecting the dataset to a small number of dimensions where standard 
visualization techniques are applicable. The main problem in performing a 
projection of that sort is that of preserving the "relevant" or "interesting" structure 
in the data. In our case this structure corresponds to the intrinsic similarities or the 
natural clustering of brain samples in the space of gene expression. 

A commonly used technique for data reduction, projection and visualization is 
Principal Component Analysis (PCA). In this approach one finds standardized 
linear combinations of variables, the "principal components," which are orthogonal 
and explain all of the variance in the original dataset. For more details see for 
example ref. 3. A typical method to obtain a simple projection (multi-dimensional 
scaling) of the dataset is to plot the top 2 or 3 principal components, which may 
account for a significant fraction of the variance, in a 2D or 3D scatter plot. 

To study the natural clustering of the brain tumor samples we performed PCA 
and projected the top three components in 3D and 2D scatter plots (some 
shown in the paper as part of Figure 1). We considered two subsets of genes: 
highly varying genes, those with the highest variation across samples that passed 
a variation filter (1,065 genes), and marker genes, the top 10 marker genes of each 
tumor class selected using the signal-to-noise statistic as described in the 
statistical analysis and prediction section. For the highest variation genes the 
values were thresholded to 100 from below and 16,000 from above, and the 
variation filter selected genes with at least a 12-fold and 1,200 absolute units of 
variation between the minimum and maximum values across samples. This 
produced a subset of 1,065 highly varying genes. For the marker genes the values 
were thresholded to 20 from below and 16,000 from above, and a variation filter 
selected genes with at least a 5-fold and 500 absolute units of variation between 
the minimum and maximum values across samples. The genes that passed this 
filter were ranked according to signal to noise (using medians) and the top 10 
markers for each class were selected. This produced a total of 50 genes. 

Once the appropriate subset of highly varying or marker genes was selected, we 
computed the 3 principal components using the S-Plus statistical software 
package with default settings (S-Plus statistical software package: S-Plus 2000, 
Guide to Statistics Volume 2, chapter 1, 
http://www.splus.mathsoft.com/products/splus/splusintro.html). These three 
components were then plotted in 3D scatter plots. Figure 1 in the paper shows 
these plots for highly varying and marker genes, where each type of brain tumor is 
shown in a different color. The plots show the "natural" clustering of brain tumor 
samples in these two subspaces of gene expression. The components and plots 
can also be seen in the Multiple tumor PCA section. Besides the 2D and 3D plots 
of the top 3 components, we also include bar graphs showing the relative 
importance of the top components and the loadings of the top 6 genes for each 
component. 
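
The thresholding and variation filter applied before PCA can be sketched as follows. This covers only the filtering step, not the PCA itself (which was run in S-Plus); the function name, the dictionary layout and the gene names in the example are hypothetical, while the numeric settings are the ones quoted above for the highly varying genes.

```python
def variation_filter(expression, floor, ceiling, min_fold, min_delta):
    """Keep genes whose clipped values vary enough across samples.

    expression : dict mapping gene name -> list of expression values
    floor, ceiling : thresholds applied from below and from above (floor > 0)
    min_fold  : minimum ratio max/min after thresholding
    min_delta : minimum absolute difference max - min after thresholding
    """
    kept = []
    for gene, values in expression.items():
        clipped = [min(max(v, floor), ceiling) for v in values]
        lowest, highest = min(clipped), max(clipped)
        if highest / lowest >= min_fold and highest - lowest >= min_delta:
            kept.append(gene)
    return kept

if __name__ == "__main__":
    toy = {
        "geneA": [50, 2000, 150],    # 20-fold, 1900 units after thresholding
        "geneB": [300, 900, 500],    # only 3-fold
        "geneC": [120, 1400, 90],    # 14-fold, 1300 units after thresholding
    }
    # Settings quoted in the text for the highly varying gene subset.
    print(variation_filter(toy, floor=100, ceiling=16000,
                           min_fold=12, min_delta=1200))
```

Thresholding from below before taking the ratio keeps near-zero or negative readings from producing arbitrarily large fold changes.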

Combined classifiers. 

The fact that the prediction algorithms sometimes make mistakes on different 
samples, and that the class structure of the confusion matrices is different for each 
algorithm, motivated us to combine some of them to see if the predictions can be 
improved in this way. We chose a simple scheme combining three algorithms 
according to majority. For example, if the outputs of the three algorithms for a given 
sample are Survivor, Failure, and Survivor, then the output of the combined 
predictor will be Survivor. The results for two types of model combinations using 
the simple majority rule (Staging, k-NN and TrkC; and SVM, k-NN and TrkC) can be 
seen in the Combined treatment outcome predictors section. 
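
The majority rule can be written in a couple of lines of Python. The function name is hypothetical; with three predictors and two classes a strict majority always exists, so no tie-breaking is needed.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most of the component classifiers."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label

if __name__ == "__main__":
    # Outputs of three component predictors for one sample.
    print(majority_vote(["Survivor", "Failure", "Survivor"]))
```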

Section II: datasets and clinical attributes 

The following sections of this document describe the samples, clinical attributes and 
datasets in detail. 



List of all samples 

Number  Sample name    Type                 Subtype       Chang  Sex  Age at diagnosis  Followup  Current status  Chemotherapy
                                                          Stage       [years/months]    [Months]  [Alive/Dead]

1       Brain_MD_1     Medulloblastoma      Classic       T4M1   M    8m                11        D               V,C,Cx,VP
2       Brain_MD_2     Medulloblastoma      Classic       T2M0   M    8yr 10m           5         D               V,C,Cx,VP
3       Brain_MD_3     Medulloblastoma      Classic       T3M0   M    6yr               7         D               V,C,Cx
4       Brain_MD_4     Medulloblastoma      Classic       T3M3   M    5yr 3m            7         D               V,C,Cx,VP
5       Brain_MD_5     Medulloblastoma      Classic       M3     M    38yr 2m           7         D               V,C
6       Brain_MD_6     Medulloblastoma      Classic       T4M0   F    7m                9         D               V,C,Cx
7       Brain_MD_7     Medulloblastoma      Classic       T1M0   M    6yr 5m            14        D               V,C,Cx
8       Brain_MD_8     Medulloblastoma      Classic       T3bM1  M    6yr 1m            16        D               V,C,Cx
9       Brain_MD_9     Medulloblastoma      Classic       M0     M    8yr               18        D               V,C,Cx,VP
10      Brain_MD_10    Medulloblastoma      Classic       M0     M    3yr 10m           18        D               V,C,Cx
11      Brain_MD_11    Medulloblastoma      Classic       T2M1   M    8yr 2m            19        D               V,C,Cx,VP,Ca,T,M
12      Brain_MD_12    Medulloblastoma      Classic       M0     F    3yr 9m            25        D               V,C,Cx
13      Brain_MD_13    Medulloblastoma      Classic       T3M3   M    4yr 5m            26        D               V,C,Cx
14      Brain_MD_14    Medulloblastoma      Desmoplastic  M0     M    6yr 3m            33        D               V,C,CC
15      Brain_MD_15    Medulloblastoma      Desmoplastic  T2M0   F    11yr 7m           38        D               V,C,Cx,VP
16      Brain_MD_16    Medulloblastoma      Desmoplastic  T3M3   F    11yr 5m           39        D               V,C,VP
17      Brain_MD_17    Medulloblastoma      Classic       T3bM3  F    3yr 3m            39        D               V,C,Cx
18      Brain_MD_18    Medulloblastoma      Classic       T2M3   M    4yr 4m            42        D               V,C,Cx
19      Brain_MD_19    Medulloblastoma      Classic       M2     F    26yr 1m           65        D               V,C,Cx,VP
20      Brain_MD_20    Medulloblastoma      Classic       T3bM0  M    20yr 6m           92        D               V,C
21      Brain_MD_21    Medulloblastoma      Classic       T2M0   F    23yr 3m           102       D               V,C
22      Brain_MD_22    Medulloblastoma      Desmoplastic  M0     F    5yr 7m            24        A               V,C,CC
23      Brain_MD_23    Medulloblastoma      Desmoplastic  T4M0   M    1yr 4m            25        A               V,C,Cx
24      Brain_MD_24    Medulloblastoma      Classic       T3M0   M    10yr 10m          27        A               V,C,Cx
25      Brain_MD_25    Medulloblastoma      Classic       M0     F    5yr 4m            28        A               V,C,Cx,VP
26      Brain_MD_26    Medulloblastoma      Classic       T2M3   M    1yr               33        A               V,C,Cx,VP
27      Brain_MD_27    Medulloblastoma      Classic       M0     M    5yr 10m           34        A               V,C,Cx
28      Brain_MD_28    Medulloblastoma      Desmoplastic  T4M0   M    6yr 1m            35        A               V,C,Cx
29      Brain_MD_29    Medulloblastoma      Classic       T3M0   F    7yr 5m            35        A               V,C,Cx
30      Brain_MD_30    Medulloblastoma      Desmoplastic  T3M0   F    11yr 9m           36        A               V,C,Cx
31      Brain_MD_31    Medulloblastoma      Classic       M0     M    7yr 4m            39        A               V,C,Cx
32      Brain_MD_32    Medulloblastoma      Desmoplastic  T2M0   M    10yr 11m          39        A               V,C,Cx
33      Brain_MD_33    Medulloblastoma      Classic       T3bM0  M    12yr 9m           41        A               V,C,Cx
34      Brain_MD_34    Medulloblastoma      Classic       T3M1   M    8yr 2m            42        A               V,C,Cx
35      Brain_MD_35    Medulloblastoma      Desmoplastic  T3M0   F    2yr 3m            45        A               V,C,Cx
36      Brain_MD_36    Medulloblastoma      Classic       T3M0   M    5yr 6m            46        A               V,C,Cx
37      Brain_MD_37    Medulloblastoma      Classic       T3M0   F    12yr 7m           51        A               V,C,Cx
38      Brain_MD_38    Medulloblastoma      Desmoplastic  T3M1   F    7m                52        A               V,C,Cx
39      Brain_MD_39    Medulloblastoma      Classic       T3M0   M    10yr 9m           53        A               V,C,Cx
40      Brain_MD_40    Medulloblastoma      Desmoplastic  T4M3   M    3yr 4m            57        A               V,C,Cx
41      Brain_MD_41    Medulloblastoma      Classic       T4M0   F    4yr 8m            60        A               V,C,Cx,VP
42      Brain_MD_42    Medulloblastoma      Classic       T3M3   M    6yr               62        A               V,C,Cx,VP
43      Brain_MD_43    Medulloblastoma      Classic       T3M0   M    9yr 3m            64        A               V,C,Cx
44      Brain_MD_44    Medulloblastoma      Classic       T3M0   M    5yr 3m            66        A               V,C,Cx
45      Brain_MD_45    Medulloblastoma      Classic       T4M0   M    3yr 6m            68        A               V,C,Cx,P
46      Brain_MD_46    Medulloblastoma      Classic       T3M0   M    2yr 4m            68        A               V,C,Cx
47      Brain_MD_47    Medulloblastoma      Classic       T4M0   F    10yr 6m                     A               V,C,Cx
48      Brain_MD_48    Medulloblastoma      Classic       T3bM0  M    5yr 5m            72        A               V,C,Cx,VP,Ca
49      Brain_MD_49    Medulloblastoma      Classic       T2M0   F    12yr 11m          74        A               V,C,Cx
50      Brain_MD_50    Medulloblastoma      Classic       T3bM0  M    9yr 11m           79        A               V,C,Cx
51      Brain_MD_51    Medulloblastoma      Classic       T3bM0  M    13yr 8m                     A               V,C,Cx
52      Brain_MD_52    Medulloblastoma      Classic       T2M0   M    1yr 8m                      A               V,C,Cx
53      Brain_MD_53    Medulloblastoma      Desmoplastic  T2M0   F    5yr 2m            84        A               V,C,Cx
54      Brain_MD_54    Medulloblastoma      Classic       T4M4   F    1yr 5m            85        A               V,C,Cx,VP,Ca,T,
55      Brain_MD_55    Medulloblastoma      Classic       T3bM2  M    10yr 4m           87        A               V,C,Cx,VP
56      Brain_MD_56    Medulloblastoma      Desmoplastic  T2M0   F    28yr              17        A               V,C
57      Brain_MD_57    Medulloblastoma      Classic       T2M3   M    2yr 7m            97        A               V,C,Cx
58      Brain_MD_58    Medulloblastoma      Classic       T1M0   M    3yr 7m            108       A               V,C,Cx,VP
59      Brain_MD_59    Medulloblastoma      Classic       T3bM0  M    9yr 9m            130       A               V,C
60      Brain_MD_60    Medulloblastoma      Desmoplastic  T3M0   F    2yr               24        A               V,C,Cx
61      Brain_MD_61    Medulloblastoma
62      Brain_MD_62    Medulloblastoma                                                                            V,C,Cx
63      Brain_MD_63    Medulloblastoma
64      Brain_MD_64    Medulloblastoma                                                                            V,C,Cx
65      Brain_MD_65    Medulloblastoma                                                                            V,C,Cx
66      Brain_MD_66    Medulloblastoma                                                                            V,C
67      Brain_MD_67    Medulloblastoma                                                                            V,C,Cx,VP
68      Brain_MGlio_1  Malignant Glioma
69      Brain_MGlio_2  Malignant Glioma
70      Brain_MGlio_3  Malignant Glioma
71      Brain_MGlio_4  Malignant Glioma
72      Brain_MGlio_5  Malignant Glioma
73      Brain_MGlio_6  Malignant Glioma
74      Brain_MGlio_7  Malignant Glioma
75      Brain_MGlio_8  Malignant Glioma
76      Brain_MGlio_9  Malignant Glioma
77      Brain_MGlio_10 Malignant Glioma
78      Brain_Rhab_1   AT/RT (Brain)
79      Brain_Rhab_2   AT/RT (Renal)
80      Brain_Rhab_3   AT/RT (Renal)
81      Brain_Rhab_4   AT/RT (Brain)
82      Brain_Rhab_5   AT/RT (Extra Renal)
83      Brain_Rhab_6   AT/RT (Extra Renal)
84      Brain_Rhab_7   AT/RT (Renal)
85      Brain_Rhab_8   AT/RT (Brain)
86      Brain_Rhab_9   AT/RT (Brain)
87      Brain_Rhab_10  AT/RT (Brain)
88      Brain_Ncer_1   Normal cerebellum
89      Brain_Ncer_2   Normal cerebellum
90      Brain_Ncer_3   Normal cerebellum
91      Brain_Ncer_4   Normal cerebellum
92      Brain_PNET_1   PNET
93      Brain_PNET_2   PNET
94      Brain_PNET_3   PNET
95      Brain_PNET_4   PNET
96      Brain_PNET_5   PNET
97      Brain_PNET_6   PNET
98      Brain_PNET_7   PNET (pineoblastoma)
99      Brain_PNET_8   PNET (pineoblastoma)

Chemotherapy: V = vincristine, C = cisplatin, Cx = Cytoxan, VP = etoposide, 
CC = CCNU, Ca = carboplatin, P = procarbazine, M = methotrexate, T = thiotepa. 

Dataset A, A1, A2 - multiple tumor samples 

Dataset A: 10 medulloblastomas, 10 malignant gliomas, 10 AT/RT (5 CNS, 5 renal- 
extrarenal), 4 normal cerebellums and 8 supratentorial PNETs. 

Two of the supratentorial PNETs are pineoblastomas, which historically have been 
inconsistently included in the PNET category. The analysis was repeated excluding these 2 
pineoblastomas. 

Dataset A1: 10 medulloblastomas, 10 malignant gliomas, 10 AT/RT (5 CNS, 5 renal- 
extrarenal), 4 normal cerebellums and 6 supratentorial PNETs. 

To test whether inclusion of a larger number of medulloblastomas might lessen the 
distinctions noted in Dataset A, 50 more medulloblastoma samples were added and the 
PCA analysis repeated. 

Dataset A2: 60 medulloblastomas, 10 malignant gliomas, 10 AT/RT (5 CNS, 5 renal- 
extrarenal), 4 normal cerebellums and 6 supratentorial PNETs. 



Dataset A 



Sample number  Sample name     Type

1              Brain_MD_12     Medulloblastoma
2              Brain_MD_61     Medulloblastoma
3              Brain_MD_15     Medulloblastoma
4              Brain_MD_57     Medulloblastoma
5              Brain_MD_33     Medulloblastoma
6              Brain_MD_64     Medulloblastoma
7              Brain_MD_17     Medulloblastoma
8              Brain_MD_62     Medulloblastoma
9              Brain_MD_63     Medulloblastoma
10             Brain_MD_32     Medulloblastoma
11             Brain_MGlio_1   Malignant Glioma
12             Brain_MGlio_2   Malignant Glioma
13             Brain_MGlio_3   Malignant Glioma
14             Brain_MGlio_4   Malignant Glioma
15             Brain_MGlio_5   Malignant Glioma
16             Brain_MGlio_6   Malignant Glioma
17             Brain_MGlio_7   Malignant Glioma
18             Brain_MGlio_8   Malignant Glioma
19             Brain_MGlio_9   Malignant Glioma
20             Brain_MGlio_10  Malignant Glioma
21             Brain_Rhab_1    AT/RT (Brain)
22             Brain_Rhab_2    AT/RT (Renal)
23             Brain_Rhab_3    AT/RT (Renal)
24             Brain_Rhab_4    AT/RT (Brain)
25             Brain_Rhab_5    AT/RT (Extra Renal)
26             Brain_Rhab_6    AT/RT (Extra Renal)
27             Brain_Rhab_7    AT/RT (Renal)
28             Brain_Rhab_8    AT/RT (Brain)
29             Brain_Rhab_9    AT/RT (Brain)
30             Brain_Rhab_10   AT/RT (Brain)
31             Brain_Ncer_1    Normal cerebellum
32             Brain_Ncer_2    Normal cerebellum
33             Brain_Ncer_3    Normal cerebellum
34             Brain_Ncer_4    Normal cerebellum
35             Brain_PNET_1    PNET
36             Brain_PNET_2    PNET
37             Brain_PNET_3    PNET
38             Brain_PNET_4    PNET
39             Brain_PNET_5    PNET
40             Brain_PNET_6    PNET
41             Brain_PNET_7    PNET (pineoblastoma)
42             Brain_PNET_8    PNET (pineoblastoma)



Dataset A1 



Sample number  Sample name     Type

1              Brain_MD_12     Medulloblastoma
2              Brain_MD_61     Medulloblastoma
3              Brain_MD_15     Medulloblastoma
4              Brain_MD_57     Medulloblastoma
5              Brain_MD_33     Medulloblastoma
6              Brain_MD_64     Medulloblastoma
7              Brain_MD_17     Medulloblastoma
8              Brain_MD_62     Medulloblastoma
9              Brain_MD_63     Medulloblastoma
10             Brain_MD_32     Medulloblastoma
11             Brain_MGlio_1   Malignant Glioma
12             Brain_MGlio_2   Malignant Glioma
13             Brain_MGlio_3   Malignant Glioma
14             Brain_MGlio_4   Malignant Glioma
15             Brain_MGlio_5   Malignant Glioma
16             Brain_MGlio_6   Malignant Glioma
17             Brain_MGlio_7   Malignant Glioma
18             Brain_MGlio_8   Malignant Glioma
19             Brain_MGlio_9   Malignant Glioma
20             Brain_MGlio_10  Malignant Glioma
21             Brain_Rhab_1    AT/RT (Brain)
22             Brain_Rhab_2    AT/RT (Renal)
23             Brain_Rhab_3    AT/RT (Renal)
24             Brain_Rhab_4    AT/RT (Brain)
25             Brain_Rhab_5    AT/RT (Extra Renal)
26             Brain_Rhab_6    AT/RT (Extra Renal)
27             Brain_Rhab_7    AT/RT (Renal)
28             Brain_Rhab_8    AT/RT (Brain)
29             Brain_Rhab_9    AT/RT (Brain)
30             Brain_Rhab_10   AT/RT (Brain)
31             Brain_Ncer_1    Normal cerebellum
32             Brain_Ncer_2    Normal cerebellum
33             Brain_Ncer_3    Normal cerebellum
34             Brain_Ncer_4    Normal cerebellum
35             Brain_PNET_1    PNET
36             Brain_PNET_2    PNET
37             Brain_PNET_3    PNET
38             Brain_PNET_4    PNET
39             Brain_PNET_5    PNET
40             Brain_PNET_6    PNET



Dataset A2 



Sample number  Sample name     Type

1              Brain_MD_1      Medulloblastoma
2              Brain_MD_2      Medulloblastoma
3              Brain_MD_3      Medulloblastoma
4              Brain_MD_4      Medulloblastoma
5              Brain_MD_5      Medulloblastoma
6              Brain_MD_6      Medulloblastoma
7              Brain_MD_7      Medulloblastoma
8              Brain_MD_8      Medulloblastoma
9              Brain_MD_9      Medulloblastoma
10             Brain_MD_10     Medulloblastoma
11             Brain_MD_11     Medulloblastoma
12             Brain_MD_12     Medulloblastoma
13             Brain_MD_13     Medulloblastoma
14             Brain_MD_14     Medulloblastoma
15             Brain_MD_15     Medulloblastoma
16             Brain_MD_16     Medulloblastoma
17             Brain_MD_17     Medulloblastoma
18             Brain_MD_18     Medulloblastoma
19             Brain_MD_19     Medulloblastoma
20             Brain_MD_20     Medulloblastoma
21             Brain_MD_21     Medulloblastoma
22             Brain_MD_22     Medulloblastoma
23             Brain_MD_23     Medulloblastoma
24             Brain_MD_24     Medulloblastoma
25             Brain_MD_25     Medulloblastoma
26             Brain_MD_26     Medulloblastoma
27             Brain_MD_27     Medulloblastoma
28             Brain_MD_28     Medulloblastoma
29             Brain_MD_29     Medulloblastoma
30             Brain_MD_30     Medulloblastoma
31             Brain_MD_31     Medulloblastoma
32             Brain_MD_32     Medulloblastoma
33             Brain_MD_33     Medulloblastoma
34             Brain_MD_34     Medulloblastoma
35             Brain_MD_35     Medulloblastoma
36             Brain_MD_36     Medulloblastoma
37             Brain_MD_37     Medulloblastoma
38             Brain_MD_38     Medulloblastoma
39             Brain_MD_39     Medulloblastoma
40             Brain_MD_40     Medulloblastoma
41             Brain_MD_41     Medulloblastoma
42             Brain_MD_42     Medulloblastoma
43             Brain_MD_43     Medulloblastoma
44             Brain_MD_44     Medulloblastoma
45             Brain_MD_45     Medulloblastoma
46             Brain_MD_46     Medulloblastoma
47             Brain_MD_47     Medulloblastoma
48             Brain_MD_48     Medulloblastoma
49             Brain_MD_49     Medulloblastoma
50             Brain_MD_50     Medulloblastoma
51             Brain_MD_51     Medulloblastoma
52             Brain_MD_52     Medulloblastoma
53             Brain_MD_53     Medulloblastoma
54             Brain_MD_54     Medulloblastoma
55             Brain_MD_55     Medulloblastoma
56             Brain_MD_56     Medulloblastoma
57             Brain_MD_57     Medulloblastoma
58             Brain_MD_58     Medulloblastoma
59             Brain_MD_59     Medulloblastoma
60             Brain_MD_60     Medulloblastoma
61             Brain_MGlio_1   Malignant Glioma
62             Brain_MGlio_2   Malignant Glioma
63             Brain_MGlio_3   Malignant Glioma
64             Brain_MGlio_4   Malignant Glioma
65             Brain_MGlio_5   Malignant Glioma
66             Brain_MGlio_6   Malignant Glioma
67             Brain_MGlio_7   Malignant Glioma
68             Brain_MGlio_8   Malignant Glioma
69             Brain_MGlio_9   Malignant Glioma
70             Brain_MGlio_10  Malignant Glioma
71             Brain_Rhab_1    AT/RT (Brain)
72             Brain_Rhab_2    AT/RT (Renal)
73             Brain_Rhab_3    AT/RT (Renal)
74             Brain_Rhab_4    AT/RT (Brain)
75             Brain_Rhab_5    AT/RT (Extra Renal)
76             Brain_Rhab_6    AT/RT (Extra Renal)
77             Brain_Rhab_7    AT/RT (Renal)
78             Brain_Rhab_8    AT/RT (Brain)
79             Brain_Rhab_9    AT/RT (Brain)
80             Brain_Rhab_10   AT/RT (Brain)
81             Brain_Ncer_1    Normal cerebellum
82             Brain_Ncer_2    Normal cerebellum
83             Brain_Ncer_3    Normal cerebellum
84             Brain_Ncer_4    Normal cerebellum
85             Brain_PNET_1    PNET
86             Brain_PNET_2    PNET
87             Brain_PNET_3    PNET
88             Brain_PNET_4    PNET
89             Brain_PNET_5    PNET
90             Brain_PNET_6    PNET



Dataset B - MD classic-desmoplastic 



Dataset B: 25 classic and 9 desmoplastic medulloblastomas. 

Number  Sample name    Type                 Subtype

1       Brain_MD_7     Medulloblastoma      Classic
2       Brain_MD_59    Medulloblastoma      Classic
3       Brain_MD_20    Medulloblastoma      Classic
4       Brain_MD_21    Medulloblastoma      Classic
5       Brain_MD_50    Medulloblastoma      Classic
6       Brain_MD_49    Medulloblastoma      Classic
7       Brain_MD_45    Medulloblastoma      Classic
8       Brain_MD_43    Medulloblastoma      Classic
9       Brain_MD_8     Medulloblastoma      Classic
10      Brain_MD_42    Medulloblastoma      Classic
11      Brain_MD_1     Medulloblastoma      Classic
12      Brain_MD_4     Medulloblastoma      Classic
13      Brain_MD_55    Medulloblastoma      Classic
14      Brain_MD_41    Medulloblastoma      Classic
15      Brain_MD_37    Medulloblastoma      Classic
16      Brain_MD_3     Medulloblastoma      Classic
17      Brain_MD_34    Medulloblastoma      Classic
18      Brain_MD_29    Medulloblastoma      Classic
19      Brain_MD_13    Medulloblastoma      Classic
20      Brain_MD_24    Medulloblastoma      Classic
21      Brain_MD_65    Medulloblastoma      Classic
22      Brain_MD_5     Medulloblastoma      Classic
23      Brain_MD_66    Medulloblastoma      Classic
24      Brain_MD_67    Medulloblastoma      Classic
25      Brain_MD_58    Medulloblastoma      Classic
26      Brain_MD_53    Medulloblastoma      Desmoplastic
27      Brain_MD_56    Medulloblastoma      Desmoplastic
28      Brain_MD_16    Medulloblastoma      Desmoplastic
29      Brain_MD_40    Medulloblastoma      Desmoplastic
30      Brain_MD_35    Medulloblastoma      Desmoplastic
31      Brain_MD_30    Medulloblastoma      Desmoplastic
32      Brain_MD_23    Medulloblastoma      Desmoplastic
33      Brain_MD_28    Medulloblastoma      Desmoplastic
34      Brain_MD_60    Medulloblastoma      Desmoplastic

Dataset C - MD outcome 

Dataset C: 39 medulloblastoma survivors and 21 treatment failures 
(non-survivors). 



Number  Sample name  Type             Subtype       Chang stage  Sex  Age at diagnosis  Followup  Current status
                                                                      [years/months]    [Months]  [Alive/Dead]

1       Brain_MD_1   Medulloblastoma  Classic       T4M1         M    8m                11        D
2       Brain_MD_2   Medulloblastoma  Classic       T2M0         M    8yr 10m           5         D
3       Brain_MD_3   Medulloblastoma  Classic       T3M0         M    6yr               7         D
4       Brain_MD_4   Medulloblastoma  Classic       T3M3         M    5yr 3m            7         D
5       Brain_MD_5   Medulloblastoma  Classic       M3           M    38yr 2m           7         D
6       Brain_MD_6   Medulloblastoma  Classic       T4M0         F    7m                9         D
7       Brain_MD_7   Medulloblastoma  Classic       T1M0         M    ?yr 5m            14        D
8       Brain_MD_8   Medulloblastoma  Classic       T3bM1        M    6yr 1m            16        D
9       Brain_MD_9   Medulloblastoma  Classic       M0           M    8yr               18        D
10      Brain_MD_10  Medulloblastoma  Classic       M0           M    3yr 10m           18        D
11      Brain_MD_11  Medulloblastoma  Classic       T2M1         M    8yr 2m            19        D
12      Brain_MD_12  Medulloblastoma  Classic       M0           F    3yr 9m            25        D
13      Brain_MD_13  Medulloblastoma  Classic       T3M3         M    4yr 5m            26        D
14      Brain_MD_14  Medulloblastoma  Desmoplastic  M0           M    6yr 3m            33        D
15      Brain_MD_15  Medulloblastoma  Desmoplastic  T2M0         F    11yr 7m           38        D
16      Brain_MD_16  Medulloblastoma  Desmoplastic  T3M3         F    11yr 5m           39        D
17      Brain_MD_17  Medulloblastoma  Classic       T3bM3        F    3yr 3m            39        D
18      Brain_MD_18  Medulloblastoma  Classic       T2M3         M    4yr 4m            42        D
19      Brain_MD_19  Medulloblastoma  Classic       M2           F    26yr 1m           65        D
20      Brain_MD_20  Medulloblastoma  Classic       T3bM0        M    20yr 6m           92        D
21      Brain_MD_21  Medulloblastoma  Classic       T2M0         F    23yr 3m           102       D
22      Brain_MD_22  Medulloblastoma  Desmoplastic  M0           F    5yr 7m            24        A
23      Brain_MD_23  Medulloblastoma  Desmoplastic  T4M0         M    1yr 4m            25        A
24      Brain_MD_24  Medulloblastoma  Classic       T3M0         M    10yr 10m          27        A
25      Brain_MD_25  Medulloblastoma  Classic       M0           F    ?yr 4m            28        A
26      Brain_MD_26  Medulloblastoma  Classic       T2M3         M    1yr               33        A
27      Brain_MD_27  Medulloblastoma  Classic       M0           M    5yr 10m           34        A
28      Brain_MD_28  Medulloblastoma  Desmoplastic  T4M0         M    6yr 1m            35        A
29      Brain_MD_29  Medulloblastoma  Classic       T3M0         F    7yr 5m            35        A
30      Brain_MD_30  Medulloblastoma  Desmoplastic  T3M0         F    11yr 9m           36        A
31      Brain_MD_31  Medulloblastoma  Classic       M0           M    7yr 4m            39        A
32      Brain_MD_32  Medulloblastoma  Desmoplastic  T2M0         M    10yr 11m          39        A
33      Brain_MD_33  Medulloblastoma  Classic       T3bM0        M    12yr 9m           41        A
34      Brain_MD_34  Medulloblastoma  Classic       T3M1         M    8yr 2m            42        A
35      Brain_MD_35  Medulloblastoma  Desmoplastic  T3M0         F    2yr 3m            45        A
36      Brain_MD_36  Medulloblastoma  Classic       T3M0         M    5yr 6m            46        A
37      Brain_MD_37  Medulloblastoma  Classic       T3M0         F    12yr 7m           51        A
38      Brain_MD_38  Medulloblastoma  Desmoplastic  T3M1         F    7m                52        A
39      Brain_MD_39  Medulloblastoma  Classic       T3M0         M    10yr 9m           53        A
40      Brain_MD_40  Medulloblastoma  Desmoplastic  T4M3         M    3yr 4m            57        A
41      Brain_MD_41  Medulloblastoma  Classic       T4M0         F    4yr 8m            60        A
42      Brain_MD_42  Medulloblastoma  Classic       T3M3         M    6yr               62        A
43      Brain_MD_43  Medulloblastoma  Classic       T?M0         M    9yr 3m            64        A
44      Brain_MD_44  Medulloblastoma  Classic       T3M0         M    5yr 3m            66        A
45      Brain_MD_45  Medulloblastoma  Classic       T4M0         M    3yr 6m            68        A
46      Brain_MD_46  Medulloblastoma  Classic       T3M0         M    2yr 4m            68        A
47      Brain_MD_47  Medulloblastoma  Classic       T4M0         F    10yr 6m           70        A
48      Brain_MD_48  Medulloblastoma  Classic       T3bM0        M    5yr 5m            72        A



Chemotherapy 



y.c.cx 
kc,cx 



V,C,Cx 
V.CCx 



V.C.Cx 



V,C,Cx,VP,Ca 
Number  Sample name  Type             Subtype       Chang stage  Sex  Age at diagnosis  Followup  Current status  Chemotherapy
                                                                      [years/months]    [Months]  [Alive/Dead]

49      Brain_MD_49  Medulloblastoma  Classic       T2M0         F    12yr 11m          74        A               V,C,Cx
50      Brain_MD_50  Medulloblastoma  Classic       T3bM0        M    ?yr 11m           79        A               V,C,Cx
51      Brain_MD_51  Medulloblastoma  Classic       T3bM0        M    13yr 8m           79        A               V,C,Cx
52      Brain_MD_52  Medulloblastoma  Classic       T2M0         M    1yr 8m            80        A               V,C,Cx
53      Brain_MD_53  Medulloblastoma  Desmoplastic  T2M0         F    5yr 2m            84        A               V,C,Cx
54      Brain_MD_54  Medulloblastoma  Classic       T4M4         F    1yr 5m            85        A               V,C,Cx,VP,Ca,T,M
55      Brain_MD_55  Medulloblastoma  Classic       T3bM2        M    10yr 4m           87        A               V,C,Cx,VP
56      Brain_MD_56  Medulloblastoma  Desmoplastic  T2M0         F    28yr              87        A               V,C
57      Brain_MD_57  Medulloblastoma  Classic       T2M3         M    2yr 7m            97        A               V,C,Cx
58      Brain_MD_58  Medulloblastoma  Classic       T1M0         M    3yr 7m            108       A               V,C,Cx,VP
59      Brain_MD_59  Medulloblastoma  Classic       T3bM0        M    9yr 9m            130       A               V,C
60      Brain_MD_60  Medulloblastoma  Desmoplastic  T3M0         F    2yr               24        A               V,C,Cx
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Section III: detailed analysis results 

This section presents the results of applying the methods of Section I to the
datasets of Section II. A brief comment precedes each table of results.



Multiple tumor PCA 

This section contains the PCA projections of highly varying and marker genes for 
datasets A, A1 and A2. The genes were filtered as described in the 
PCA and multidimensional-scaling of Brain tumor samples section. 



Dataset A (42 samples) - highly varying genes 




Highly varying genes were selected with a stringent variation filter (see the
parameters in the table below). The plot above shows the projection onto the
first 3 components. Notice the relative clustering of tumor samples according to
tissue type. The MD samples cluster tightly, while the PNET and malignant glioma
(MGlio) samples scatter much more. The rhabdoid tumors (the renal/extrarenal and
the CNS AT/RT varieties) cluster much closer to each other than to the other types.
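
The variation filter described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the analysis. It assumes the parameters listed for this dataset (values thresholded to the range 100 to 16000, max/min > 12, max - min > 1200) and treats both conditions as strict inequalities that must hold together. The function name `variation_filter` and the expression values are hypothetical.

```python
def variation_filter(expr, floor=100.0, ceiling=16000.0,
                     min_fold=12.0, min_delta=1200.0):
    """Return the genes whose thresholded expression varies enough across samples.

    expr maps a gene identifier to its list of expression values.
    """
    kept = []
    for gene, values in expr.items():
        # Threshold from below and from above before measuring variation.
        clipped = [min(max(v, floor), ceiling) for v in values]
        hi, lo = max(clipped), min(clipped)
        # Assumed rule: both the fold change and the absolute change must pass.
        if hi / lo > min_fold and hi - lo > min_delta:
            kept.append(gene)
    return kept


# Made-up values for three hypothetical genes across three samples.
example = {
    "gene_flat": [150.0, 200.0, 180.0],
    "gene_variable": [50.0, 3000.0, 20000.0],
    "gene_fold_only": [100.0, 1250.0, 300.0],
}
selected = variation_filter(example)
```

In this toy input only `gene_variable` passes: `gene_flat` fails both tests, and `gene_fold_only` exceeds the 12-fold ratio but not the absolute difference.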
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The next two plots show 2D projections of the first vs. second and second vs. third 
components. 
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The next bar graph shows the relative importance of the first components. The first 
three components account for 42.5% of the variance of the highly varying genes. 
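
The projection and the relative importance of the components can be computed along the following lines. This is a sketch under stated assumptions: it centers each gene and uses a singular value decomposition, whereas any additional scaling or transformation applied in the actual analysis is not specified here. The function name `pca_project` and the random input matrix are placeholders, not the real data.

```python
import numpy as np


def pca_project(X, n_components=3):
    """Project samples (rows of X) onto the leading principal components."""
    # Center each gene (column) so components describe variation about the mean.
    Xc = X - X.mean(axis=0)
    # SVD of the centered matrix; rows of Vt are the component loadings.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    # Squared singular values are proportional to the variance per component.
    explained = s**2 / np.sum(s**2)
    return scores, Vt[:n_components], explained[:n_components]


# Random stand-in for a 42-sample by 1065-gene matrix (not the real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(42, 1065))
scores, loadings, explained = pca_project(X)
```

Here `explained.sum()` is the fraction of variance captured by the first three components, the kind of quantity reported as 42.5% above, and the largest-magnitude entries in each row of `loadings` identify the genes that contribute most to that component.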



Relative Importance of Principal Components 




Comp. 1 Comp. 2 Comp. 3 Comp. 4 Comp. 5 Comp. 6 Comp. 7 Comp. 8 Comp. 9 Comp. 10
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The bar graph below shows the contribution of the top 6 genes for each of the 
three principal components. Notice the almost equal weight given to multiple 
genes. 




Gene labels of the bar graph, in the order printed (Comp. 1 to Comp. 3):

HG3355-HT3532  Peroxisome Prolif. Act. Rec.
M96859         DPP6
D42087         KIAA0118 gene
L40397         clone S31i125
D14838         FGF9
Z25884         ClC-1 muscle chloride channel
U06681         Clone CCA12
U31903         CREB-RP (creb-rp) mRNA
L08010         Regenerating protein I b
L13977         LYSO. PRO-X CARB. PREC.
S80562         CNN3 Calponin 3
L19711         Dystroglycan (DAG1) mRNA
X67698         Tissue specific mRNA
M29551         SER/THREO. PHOS. 2B
M31642         HPRT1
M80359         SER/THREON.-KINASE P78
HG2036-HT2090  Stim. Gdp/Gtp Ex. C-Ki-Ras P21
L10910         Splicing factor (CC1.3)



PCA of Multiple Tumor Samples 
Dataset A 

Part I: Genes with High Variation: 

Values thresholded to 100 from below and 16000 from above 
Variation filter: max/min > 12 (12-fold), max-min= 1200 absolute units 
Number of features (genes) = 1065 

                         PCA Components
Sample          class    C1          C2          C3

Brain_MD_12     0        -11.5892    -6.03135    0.440673
Brain_MD_61     0        -2.4498     -12.3523    0.541948
Brain_MD_15     0        -4.50062    -12.0282    0.534799
Brain_MD_57     0        -8.18619    -13.4928    0.255259
Brain_MD_33     0        -4.96798    -14.8525    6.25611
Brain_MD_64     0        -6.47181    -7.76031    -2.60217
Brain_MD_17     0        -8.49966    -11.5705    0.990409
Brain_MD_62     0        9.224503    -0.09585    -25.1327
Brain_MD_63     0        -10.4064    -13.4434    5.931149
Brain_MD_32     0        -3.99145    -11.8767    0.990155
Brain_MGlio_1   1        9.679457    -2.95242    8.168946
Brain_MGlio_2   1        30.79565    18.83608    14.74503
Brain_MGlio_3   1        23.58435    12.43997    11.68188
Brain_MGlio_4   1        12.51082    5.673459    2.789562
Brain_MGlio_5   1        7.913009    9.232989    11.14407
Brain_MGlio_6   1        0.940745    7.517246    4.158792
Brain_MGlio_7   1        0.124103    10.39103    7.501235
Brain_MGlio_8   1        -11.4252    8.928279    7.558252
Brain_MGlio_9   1        24.17954    14.10684    5.865164
Brain_MGlio_10  1        16.90079    1.072724    12.49571
Brain_Rhab_1    2        -22.8312    20.49559    -18.1079
Brain_Rhab_2    3        -16.4952    0.282439    -7.79654
Brain_Rhab_3    3        -20.2866    15.65098    -10.1592
Brain_Rhab_4    2        -15.4183    -0.91765    6.466784
Brain_Rhab_5    3        -19.1736    4.851007    -3.92123
Brain_Rhab_6    3        -18.5824    24.52529    -10.7888
Brain_Rhab_7    3        -13.0583    4.269002    0.225192
Brain_Rhab_8    2        -17.5284    -1.24629    -0.73811
Brain_Rhab_9    2        -7.77349    -11.2075    2.246888
Brain_Rhab_10   2        -15.5038    13.20266    -7.3608
Brain_Ncer_1    4        17.45483    -5.47981    -14.059
Brain_Ncer_2    4        24.21633    -11.2845    -17.2432
Brain_Ncer_3    4        30.04492    -4.00302    -22.484
Brain_Ncer_4    4        35.79165    -3.31034    -16.0721
Brain_PNET_1    5        -8.35903    -13.5797    5.241555
Brain_PNET_2    5        -11.4161    -2.351      9.501429
Brain_PNET_3    5        7.634704    4.054882    18.81268
Brain_PNET_4    5        -1.46068    12.68868    -3.29885
Brain_PNET_5    5        -7.45764    -12.1163    7.544325
Brain_PNET_6    5        -0.34124    -10.2635    -5.81399
Brain_PNET_7    5        12.47783    -8.04875    0.715326
Brain_PNET_8    5        4.700972    2.045522    12.7752



Dataset A (42 samples) - class marker genes 

The top 10 marker genes per class were selected as described in the Gene marker
selection section. The top 3 components for that set of 50 genes are shown below.
Notice how the clustering and separation by tissue type is now more pronounced
than in the case of highly varying genes. This is not surprising, because the
marker genes are some of the best class-separating variables and will therefore
produce one of the "cleanest" projections. Notice how the MD and PNET
classes tend to occupy different areas of expression space. This was not as
evident in the highly varying genes projection. The fact that the samples separate
so well also implies that one should be able to build a classifier that separates the
classes with low probability of error (see the results in the Multiple tumor classes
predictions (k-NN) section).
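
Selecting the top markers of one class against all others can be sketched as follows. The definition used in the analysis is given in the Gene marker selection section and is not reproduced here. This sketch assumes the commonly used signal-to-noise form, the difference of class means divided by the sum of class standard deviations, computed one class versus the rest. The function names and the toy expression values are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(values, labels, target):
    """One-vs-rest signal-to-noise score of one gene for class `target`."""
    inside = [v for v, lab in zip(values, labels) if lab == target]
    outside = [v for v, lab in zip(values, labels) if lab != target]
    spread = stdev(inside) + stdev(outside)
    if spread == 0:
        return 0.0  # no variation at all: treat the gene as uninformative
    return (mean(inside) - mean(outside)) / spread


def top_markers(expr, labels, target, k=10):
    """Return the k genes with the highest score for class `target`."""
    ranked = sorted(expr,
                    key=lambda g: signal_to_noise(expr[g], labels, target),
                    reverse=True)
    return ranked[:k]


# Toy data: three hypothetical genes measured in three MD and three PNET samples.
labels = ["MD", "MD", "MD", "PNET", "PNET", "PNET"]
expr = {
    "gene_a": [10.0, 11.0, 12.0, 1.0, 2.0, 3.0],
    "gene_b": [5.0, 6.0, 5.0, 6.0, 5.0, 6.0],
    "gene_c": [1.0, 2.0, 3.0, 10.0, 11.0, 12.0],
}
md_markers = top_markers(expr, labels, "MD", k=1)
pnet_markers = top_markers(expr, labels, "PNET", k=1)
```

Calling `top_markers` once per class with `k=10` and pooling the results would give a marker set of the kind projected here (10 genes for each of 5 classes).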
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The next two plots show 2D projections of the first vs. second and second vs. third 
components. 
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The next bar graph shows the relative importance of the first components. The first 
three components account for 60.6% of the variance of the marker genes. 
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The bar graph below shows the contribution of the top 6 genes for each of the 
three principal components. The different combination of signs in each component 
is presumably a consequence of the fact that the marker genes behave as a group 
of correlated genes but also almost orthogonal across classes. 
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Dataset A1 (40 samples) - highly varying genes 

Two of the supratentorial PNETs are pineoblastomas, which historically have been
inconsistently included in the PNET category. To study the effect of excluding
them, we repeated the same PCA analysis of highly varying and marker genes with
the 2 pineoblastomas excluded (6 rather than 8 PNETs: dataset A1). The results
are very similar to those obtained before.
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Relative Importance of Principal Components 
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Dataset A1 (40 samples) - class marker genes 
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Relative Importance of Principal Components k 

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Dataset A2 (90 samples) - highly varying genes 

To test whether inclusion of a larger number of medulloblastomas might lessen the
distinctions noted in Dataset A, 50 more medulloblastoma samples were added
and the PCA for highly varying and marker genes was repeated.
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Relative Importance of Principal Components 
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Dataset A2 (90 samples) - class marker genes 
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Relative Importance of Principal Components 

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Multiple tumor class markers 

This picture shows the top 10 markers per class, sorted by their signal-to-noise
ratios as described in the Gene marker selection section. The table below shows the
top 100 markers for each tumor class, including the permutation test values (see
Permutation-based neighborhood analysis for marker gene).
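
A permutation test of this kind can be sketched as follows. This is an illustration under assumptions, not the procedure of the Permutation-based neighborhood analysis section: class labels are shuffled repeatedly, all genes are re-scored and re-ranked each time, and for every rank the score reached at chosen quantiles of the shuffled-label distribution is recorded. The signal-to-noise form, the quantile levels (0.99, 0.95, 0.50), the function names, and the random data are all placeholders.

```python
import random
from statistics import mean, stdev


def s2n(values, labels, target):
    """Assumed signal-to-noise score: difference of means over sum of stdevs."""
    inside = [v for v, lab in zip(values, labels) if lab == target]
    outside = [v for v, lab in zip(values, labels) if lab != target]
    spread = stdev(inside) + stdev(outside)
    return (mean(inside) - mean(outside)) / spread if spread else 0.0


def ranked_scores(expr, labels, target):
    """Scores of all genes for `target`, best first."""
    return sorted((s2n(v, labels, target) for v in expr.values()), reverse=True)


def permutation_levels(expr, labels, target, n_perm=100,
                       quantiles=(0.99, 0.95, 0.50), seed=0):
    """For each rank, quantiles of the score reached with shuffled labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    per_rank = [[] for _ in expr]
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the gene-class association
        for rank, score in enumerate(ranked_scores(expr, shuffled, target)):
            per_rank[rank].append(score)
    levels = []
    for scores in per_rank:
        scores.sort()
        levels.append(tuple(scores[min(int(q * n_perm), n_perm - 1)]
                            for q in quantiles))
    return levels


# Random stand-in data: 20 hypothetical genes, 6 "MD" and 6 other samples.
data_rng = random.Random(1)
labels = ["MD"] * 6 + ["other"] * 6
expr = {f"gene_{i}": [data_rng.gauss(0.0, 1.0) for _ in labels]
        for i in range(20)}
observed = ranked_scores(expr, labels, "MD")
levels = permutation_levels(expr, labels, "MD")
```

An observed score at a given rank that exceeds the 0.99 level for that rank would be unlikely to arise from random class labels, which is the sense in which a marker can be called significant.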



[Figure: expression heat map of the top 10 marker genes for each tumor class.
Columns: MD, Mglio, Rhab, Ncer, PNET. Rows: marker genes, labeled by accession
number and gene description. Color scale: -3σ to +3σ, where σ = standard
deviation from the mean.]



Top 100 Marker Genes for each Tumor Class
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Permutation test 
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0.68034023 


0.5169537 


D80004_at 




class 0 


0.7492524 


0.7835767 


0.6664746 


0.5046996 


D76435_at 




class 0 


0.7383032 


0.77384007 


0.6535448 


0.4954919 


X83543_at 




class 0 


0.73376894 


0.7426002 


0.6453689 


0.48881397 


X62534_s_at 






0 7^127195 


0 71893577 


0 637871 

w » WW r w f 1 


0.48173288 


M96739 at 




class 0 


0.71544206 


0.7368223 


0.63101006 


0.4792574 


U26726_at 




Alacc n 


0 70fifW3fifi 
U. / uoutooo 


0 7280188 


0 6192388 


0 47128072 


HG311-HT311 


at 


class 0 


0.66780347 


0.7201379 


0.6169249 


0.4645534 


X53331_at 




class 0 


0.6607844 


0.7197015 


0.6104924 


0.46066648 


M14483_ma1_ 


_s_at 


class 0 


0.6518439 


0.707669 


0.6029651 


0.4572201 


Z69915_at 






0 646*55346 


0 7015992 

\J . r v I w w w 


0.59863406 


0.45266396 


L00022_s_at 




class 0 


0.64252174 


0.6809993 


0.5968048 


0.4494678 


U31382_at 




class 0 


0.63783944 


0.6795276 


o.oyoo4uio 


U.440/ OD£ 


Z23064_at 




class 0 


0.6361946 


0.67848146 


0.5931882 


0.44210848 


D82345_at 




class 0 


0.60254765 


0.6678708 


0.5887515 


0.43895075 


U05012_s_at 




class 0 


0.5855725 


0.6635907 


0.58740836 


0.43586975 


X87852_at 




class 0 


0.5815485 


0.6612682 


0.58239955 


0.43329534 


HG1612-HT1612_at 


class 0 


0.5815067 


0.65672636 


0.58039063 


0.43086103 


U32315_at 




class 0 


0.5648414 


0.65436137 


0.5758365 


0.42883328 


X05855_s_at 




class 0 


0.56469566 


0.653728 


0.5696867 


0.4266686 


X13546_ma1_ 


at 


class 0 


0.5588449 


0.6535496 


0.56948066 


0.4244893 


M19720_ma2_ 


.at 


class 0 


0.5560505 


0.65252143 


0.56585836 


0.4232427 


L33930_s_at 




class 0 


f\ CCAO/l/f 

0.550ii44 


U.DD104oy 






L06797_s_at 




class 0 


0.53802216 


0.6500193 


0.5646127 


0.41931587 


M23613_at 




class 0 


0.53711265 


0.64934963 


0.558147 


0.41818365 


X02404_at 




class 0 


0.53183836 


0.6488311 


0.5571701 


0.41529867 


L10838_at 




rlaee ft 


D 528T321 


0 6434776 


0 5570184 

V « W W 9 \J 1 W*T 


0.41269392 


S82024_at 




class 0 


0.5268076 


0.64197075 


0.55182576 


0.411174 


M11433_at 




class 0 


0.51967216 


0.64013463 


0.5506556 


0.4098725 


HG3088-HT3263_at 


class 0 


0.5175459 


0.63753295 


0.5503224 


0.40737128 


U28686_at 




class 0 


0.50714874 


0.6362442 


0.5472199 


0.40469232 


L40386_s_at 




class 0 


0.5039181 


0.6355672 


0.5450284 


0.4028947 


Z11502_at 




class 0 


0.50178665 


0.6329165 


0.54309994 


0.40131387 


X55733_at 




class 0 


0.50157803 


0.6304573 


0.54162425 


0.4005091 


HG4318-HT4588_s_at 


class 0 


0.5009597 


0.6291464 


0.54044646 


0.39848644 


U30521_at 




class 0 


0.5009336 


0.623241 


0.53552544 


0.39617687 


X74330_at 





Desc

INSM1 Insulinoma-associated 1 (symbol provisional)
Casein kinase II beta subunit mRNA
RhoE
Homeodomain protein (Prox 1) mRNA
KIAA0182 gene, partial cds
Zic protein
APXL Apical protein (Xenopus laevis-like)
HMG2 High-mobility group (nonhistone chromosomal) protein 2
NSCL-1 mRNA sequence
11 beta-hydroxysteroid dehydrogenase type II mRNA
Ribosomal Protein L30
MGP Matrix protein gla
PTMA gene extracted from Human prothymosin alpha mRNA
mRNA (clone ICRFp507L1876)
IG EPSILON CHAIN C REGION
G protein gamma-4 subunit mRNA
HNRPG Heterogeneous nuclear ribonucleoprotein G
NB thymosin beta
NTRK3 Neurotrophic tyrosine kinase, receptor, type 3 (TrkC)
SEX gene
Macmarcks
Syntaxin 3 mRNA
EEF1G Translation elongation factor 1 gamma
Put. HMG-17 protein gene extracted from Human HMG-17 gene for non-histone chromosomal protein HMG-17
L-myc gene (L-myc protein) extracted from Human L-myc protein gene
CD24 signal transducer mRNA and 3' region
PROBABLE G PROTEIN-COUPLED RECEPTOR LCR1 HOMOLOG
NPM1 Nucleophosmin (nucleolar phosphoprotein B23, numatrin)
CALCB Calcitonin-related polypeptide, beta
PRE-MRNA SPLICING FACTOR SRP20
SCG10
RBP1 Cellular retinol-binding protein
Splicing Factor Sc35, Alt Splice Form 3
Putative RNA binding protein RNPL mRNA
DP2 (Humdp2) mRNA
ANNEXIN XIII
EUKARYOTIC INITIATION FACTOR 4B
Lim-Domain Transcription Factor Lim-1
FRAP FK506 binding protein 12-rapamycin associated protein
PRIM1 DNA primase polypeptide 1 (49kD)
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class 0 


0.5008229 


0.62009233 


0.535334 


0.3954302 X74262_at 


RETINOBLASTOMA BINDING 










PROTPIM P4R 


class 0 


0.49198747 


0.61935806 


0.5346635 


0.39358965 U70862_at 


Nuclear factor 1 B3 mRNA 


class 0 




U. O 1 OU3t 


n 5^44887 


0 3920796 U79255 at 


X1 1 protein mRNA, partial cds 


class 0 


0.48494342 


0.6179998 


0.5340308 


0.390908 HG613-HT613_at 


Ribosomal Protein S12 


class 0 


0.4806928 


0.61442095 


0.5297601 


0.38958547 X07438_s_at 


DNA for cellular retinol binding protein 












(CRBP) exons 3 and 4 


class 0 


0.47977164 


0.61286503 


0.52914625 


0.38840055 X56465_at 


Znf6 mRNA for zinc finger transcription 












factor 


class 0 


0.47824362 


0.61250454 


0.52643156 


0.38764188 U47414_at 


Cyclin G2 mRNA 


class 0 


0.4758209 


0.6121982 


0.5250884 


0.38578293 L37043_at 


CSNK1E Casein kinase 1, epsllon 




U.HI JO^ IOD 


0 6104803 


0 5240915 


0.38462755 U02031 at 


Sterol regulatory element binding 












protein-2 mRNA 


class 0 


0.4750675 


0.60672534 


0.52364206 


0.38363606 U21090_at 


DNA polymerase delta small subunit 












mRNA 


place n 


0 46QQ813 


0 5989572 


.0.5215285 


0.3821684 U26312_s_at 


Heterochromatin protein HPIHs- 












gamma mRNA 


class 0 


0.46889964 


0.5977149 


0.51875275 


0.38159114 M96740_at 


HELIX-LOOP-HELIX PROTEIN 2 


class 0 


0.46830228 


0.59540033 


0.51809573 


0.38079777 D55716_at 


DNA REPLICATION LICENSING 










FACTOR CDC47 HOMOLOG 


class 0 


0.467083 


0.5951792 


0.5178332 


0.37955925 X52966_at 


RPL35A Ribosomal protein L35a 


class 0 


0.4641961 


0.5948711 


0.51752603 


0.37854284 Y09836_at 


3'UTR of unknown protein 


class 0 


0.45807302 


0.59466755 


0.5157532 


0.37741286 U43885_at 


Grb2-associated binder-1 mRNA 


class 0 


0.45774597 


0.5933097 


0.5148932 


0.37650257 X69398_at 


CD47 CD47 antigen (Rh-related 



antigen, integrin-associated signal 
transducer) 

class 0 0.45630834 0.58962584 0.5139473 0.37543085 X76029_at NEUROMEDIN U-25 PRECURSOR 

" class 0 0.45350492 0.5884134 0.5138038 0.3743853 M13241_at N-MYC PROTO-ONCOGENE 

PROTEIN 

class 0 0.4528938 0.5882923 0.5130492 0.37293357 D28423_at Pre-mRNA splicing factor SRp20, 

5'UTR (sequence from the 5'cap to the 
start codon) 



class 0 


0.45122162 


0.58495116 


0.5122993 


0.3718915 


U73304_ 


rna1_at 


CB1 cannabinoid receptor (CNR1) gene 


class 0 


0.4486972 


0.5828123 


0.5106254 


0.37171924 


U17195. 


.at 


A-kinase anchor protein (AKAP100) 
















mRNA 


class 0 


0.44740826 


0.58226925 


0.5075315 


0.3709006 


M93415 


jat 


ACVR2 Actlvin A receptor, type II 


class 0 


0.44720528 


0.58122206 


0.50601745 


0.36981982 


M93650. 


_at 


Paired box gene (PAX6) homologue 


class 0 


0.44666263 


0.5794936 


0.50538707 


0.3680578 


X85545_ 


at 


Protein kinase, PKX1 


class 0 


0.44536316 


0.5779683 


0.50191903 


0.36731505 


S76475_ 


at 


NTRK3 Neurotrophic tyrosine kinase, 
















receptor, type 3 (TrkC) 


class 0 


0.44418713 


0.5775526 


0.5014425 


0.366797 


U00802_ 


_s_at 


Drebrin E 


class 0 


0.44110203 


0.57679415 


0.49992916 


0.36611393 


M60299 


_at 


Alpha-1 collagen type II gene, exons 1, 
















2 and 3 


class 0 


0.43829215 


0.5762043 


0.49976385 


0.36523366 


U16954_ 


.at 


(AF1q) mRNA 


class 0 


0.43407452 


0.5753858 


0.49807757 


0.36481872 


X99657_ 


at 


Protein containing SH3 domain, 
















SH3GL2 


class 0 


0.43262523 


0.5744567 


0.49757218 


0.3631796 


X76132_ 


at 


DCC Deleted in colorectal carcinoma 


class 0 


0.43084678 


0.5735875 


0.4974876 


0.36188462 


U85193_ 


_at 


Nuclear factor I-B2 (NFIB2) mRNA 


class 0 


0.43024966 


0.5730583 


0.4963776 


0.36167425 


M82919. 


_at 


GABRB3 Gamma-aminobutyric acid 














(GABA) A receptor, beta 3 


class 0 


0.42832303 


0.5729875 


0.4960867 


0.36009976 


M27691. 


_at 


CAMP-RESPONSE ELEMENT 














BINDING PROTEIN 


class 0 


0.42654628 


0.57097447 


0.4958619 


0.3594544 


L22005_ 


at 


UBIQUITIN-CONJUGATING ENZYME 














E2-CDC34 COMPLEMENTING 


class 0 


0.42234343 


0.56863326 


0.49558508 


0.35884386 


L76159_ 


at 


FRG1 mRNA 


class 0 


0.41938537 


0.56803626 


0.4927454 


0.35822877 


U39226. 


.at 


Myosin VIIA (USH1B) mRNA 


class 0 


0.41719937 


0.5676926 


0.49170917 


0.3565564 


U38810_ 


.at 


Mab-21 cell fate-determining protein 














homolog (CAGR1) mRNA 


class 0 


0.41676784 


0.5640895 


0.49088645 


0.35503575 


U25034_ 


_s_at 


Neuronatin alpha mRNA 


class 0 


0.41542533 


0.56285304 


0.48999164 


0.35428312 


U24576. 


.at 


Breast tumor autoantigen mRNA, 
















complete sequence 


class 0 


0.4132539 


0.5625111 


0.4890851 


0.35347974 


U23803, 


.at 


Heterogeneous ribonucleoprotein AO 














mRNA 


class 0 


0.4109598 


0.56244606 


0.48815268 


0.35334146 


M83822. 


_at 


Beige-like protein (BGL) mRNA, partial 
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IrlQOO KJ 


0 4100799 


0.562175 


0.48733872 


0.3529259 


U09087_s_at 


class 0 


0.4079055 


0.5613772 


0.48599333 


0.35199758 


M91670_at 


class 0 


0.4040995 


0.5606459 


0.48535928 


0.350769 


U25789_at 


Hsiqq 0 


0 40371 194 


0 5600623 


0.48488784 


0.35030034 


M62843_at 




0 40356442 


0 55962336 


0 483534 


0.34974435 


U09953_at 




0 40279832 


0 5594289 


0 48338398 


0.34923753 


U31814_at 


class 0 


0.4027233 


0.55762196 


0.48283297 


0.34894606 


X64229_at 


class 0 


0.40206712 


0.55735785 


0.4824586 


0.34788424 


U54999_at 


class 0 


0.40179467 


0.5565962 


0.48214018 


0.34594992 


X70683_at 


class 0 


0.3995434 


0.55561066 


0.48126557 


0.34553674 


U07919_at 


class 0 


u.oy ( oof oo 






0 ^451 1 587 

U.Otj 1 1 JO f 


MR4158 at 


class 0 


0.39684495 


0.5548959 


0.48008677 


0.3444539 


U19878_at 


class 0 


0.39449787 


0.55458045 


0.47979704 


0.3438485 


AFFX- 

HUMRGE/M10098 I 
at 

J03827_at 


class 0 


0.39352742 


0.5533359 


0.47933868 


.0.342717 


class 0 


0.39003983 


0.5530556 


0.4786078 


0.3423478 


U61145_at 


class 0 


0.38979462 


0.5514548 


0.47775155 


0.34202746 


HG662-HT662_at 


class 0 


0.38538435 


0.55026263 


0.4775768 


0.34102225 


M73047_at 


class 0 


0.38463777 


0.5502042 


0.4763407 


0.34075052 


D85131_s_at 


class 1 


1.6520017 


0.9831643 


0.84544426 


0.6230137 


X86693_at 


class 1 


1.2436218 


0.88150144 


0.7559189 


0.5795857 


M93426_at 


class 1 


1.2317128 


0.86047184 


0.70928395 


0.5539352 


U48705_rna1_s_at 


class 1 


1.2259983 


0.8433512 


0.68909335 


0.5358038 


X86809_at 


class 1 


1.214929 


0.8281318 


0.6849929 


0.5217813 


U45955_at 


class 1 


1.2095517 


0.79365546 


0.6711517 


0.510208 


U53204_at 


class 1 


1.2026114 


0.7930142 


0.6636111 


0.50219953 


X13916_at 


class 1 


1.1869695 


0.77752584 


0.65392506 


0.49156818 


D87258_at 


class 1 


1.1676904 


0.7709572 


0.6380772 


0.48596418 


Z31560_s_at 


class 1 


1.1604098 


0.76437885 


0.63309973 


0.47967565 


M32886_at 


class 1 


1.1558465 


0.761579 


0.62335235 


0.47505242 


D16181_at 


class 1 


1.1461633 


0.76131815 


0.61996907 


0.4696402 


U48250_at 


class 1 


i . \ zooyyo 




U.O I UOZ I f H 


u.*tD*f ouo ( y 


nfi'*878 pt 


class 1 


1.0904534 


0.74650514 


0.6089377 


0.46025795 


K03189J_at 


class 1 


1.0883032 


0.74052924 


0.6014309 


0.4566901 


U52155_at 


class 1 


1.0646937 


0.73030424 


0.598755 


0.45272368 


L11373_at 


class 1 


1.0544381 


0.72683483 


0.5921049 


0.4487451 


M21551jna1_at 


class 1 


1.0439421 


0.7250801 


0.5877407 


0.4458085 


Z50022_at 


class 1 


1.0364326 


0.7139357 


0.58473366 


0.44330922 


HG620-HT620_at 


class 1 


1.0299566 


0.7118054 


0.5835643 


0.4414981 


M21904_at 


class 1 


1.026406 


0.70610374 


0.58319974 


0.44004935 


D38522_at 


class 1 


1.0112586 


0.7057017 


0.5741648 


0.4358882 


Z50781_at 


class 1 


1.0110288 


0.703144 


0.5740614 


0.43265316 


X54673_at 



Thymopoietin beta mRNA
Ubiquitin carrier protein (E2-EPF) mRNA
Ribosomal protein L21 mRNA
PARANEOPLASTIC ENCEPHALOMYELITIS ANTIGEN HUD
RPL9 Ribosomal protein L9
Transcriptional regulator homolog RPD3 mRNA
DEK PROTEIN
LGN protein mRNA
SOX4 SRY (sex determining region Y)-box 4
ALDH6 Aldehyde dehydrogenase 6
Rhom-3 gene, exon
Transmembrane protein mRNA
AFFX-HUMRGE/M10098_M_at (endogenous control)
DbpB-like protein mRNA
Enhancer of zeste homolog 2 (EZH2) mRNA
Epstein-Barr Virus Small Rna-Associated Protein
TPP2 Tripeptidyl peptidase II
Myc-associated zinc-finger protein of human islet
High endothelial venule
PTPRZ Protein tyrosine phosphatase, receptor-type, zeta polypeptide
Receptor tyrosine kinase DDR gene
Major astrocytic phosphoprotein PEA-15
Neuronal membrane glycoprotein M6b mRNA, partial cds
Plectin (PLEC1) mRNA
LDL-receptor related protein
Cancellous bone osteoblast mRNA for serin protease with IGF-binding motif
SOX2 SRY (sex determining region Y)-box 2
SRI Sorcin
PMP2 Peripheral myelin protein 2
Protein kinase C-binding protein RACK17 mRNA, partial cds
PROBABLE PROTEIN DISULFIDE ISOMERASE ER-60 PRECURSOR
Chorionic gonadotropin (hcg) beta subunit mRNA
Inward rectifier potassium channel Kir1.2 (Kir1.2) mRNA, partial cds
Protocadherin 43 mRNA for abbreviated PC43
Neuromedin B mRNA
Surface glycoprotein
Tyrosine Phosphatase, Epsilon
MDU1 Antigen identified by monoclonal antibodies 4F2, TRA1.10, TROP4, and T43
KIAA0080 gene, partial cds
Leucine zipper protein
SLC6A1 Solute carrier family 6 (neurotransmitter transporter, GABA), member 1
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class 1 0.9994149 0.6964176 

Class 1 0.99073607 0.68848604 

class 1 0.98774695 0.68189645 

class 1 0.9855738 0.6807615 

class 1 0.98304486 0.6677363 

class 1 0.9786299 0.6667506 

class 1 0.97468305 0.66582626 

class 1 0.9741546 0.6632827 

class 1 0.97182816 0.6625792 

class 1 0.9682741 0.6590203 

class 1 0.9622625 0.6580188 

class 1 0.9585737 0.6577229 

class 1 0.9569103 0.65765256 

class 1 0.95592564 0.6532676 

class 1 0.94691986 0.6528938 

class 1 0.9426242 0.65049857 

class 1 0.9419619 0.65049285 

class 1 0.93842375 0.6503094 

class 1 0.9366897 0.6481403 

class 1 0.93209916 0.64719605 

class 1 0.92833 0.64508885 

class 1 0.92812437 0.64227355 

class 1 0.9218112 0.64078844 

class 1 0.92119294 0.63995963 

class 1 0.9184531 0.6391437 

class 1 0.9155559 0.6370653 

class 1 0.90545046 0.63566005 

class 1 0.8919348 0.63485956 

class 1 0.8909992 0.63345426 

class 1 0.8867211 0.6329042 

class 1 0.8820352 0.63253194 

class 1 0.87914836 0.6318433 

Class 1 0.87837005 0.6309211 

class 1 0.87823164 0.62923473 

class 1 0.8760671 0.6286636 

class 1 0.8760668 0.62734014 

Class 1 0.87513036 0.6266346 

class 1 0.8745863 0.6246063 



0.5672535 


0.43000817 


M63623. 


_at 


0.56642866 


0.42853382 


M97796. 


_s_at 


0.56605524 


0.4269275 


L22214_ 


at 


0.564807 


0.42519456 


M23254. 


_at 


0.5630516 


0.42373672 


S80905. 


.Lat 


0.56217647 


0.4208039 


M32304. 


_s_at 


0.5576709 


0.41835085 


U79272.. 


.at 


0.5567529 


0.4152393 


D25217_ 


_at 


0.5549602 


0.41295442 


U59877. 


_s_at 


0.5520322 


0.4113537 


U07807_ 




0.54826355 


0.4092988 


D 14689. 


_at 


0.5479787 


0.4066317 


X98085. 


.at 


0.547915 


0.40562418 


D49817_ 


_at 


0.543061 


0.40421706 


M 1 6424 


_at 


0.54168797 


0.40255094 


M62302. 


_at 


0.5364322 


0.4010224 


L32961_ 


at 


0.53504795 


0.4002328 


S56151_ 


_s_at 




0 1QQ0202 


U90547 


at 






U7R^88 
\J I oooo. 


at 


0.53102833 


0.39576787 


L24559_ 


at 


0.5292084 


0.39434457 


D79999_ 


_at 


0.5264592 


0.3926984 


S73591_ 


.at 


Q.o2d0d547 


o.jy iouu/4 


A04D0 1 _ 


a i 


0.5255806 


0.39071545 


U 12707. 


_s_at 


0.5252529 


0.38848042 


X75958, 


.at 


0.5248767 


0.38763252 


X04828. 


at 


0.52347106 


0.38607994 


S 82297. 


at 


0.52277017 


0.384728 


S45630. 


at 


0.51961327 


0.38367814 


D13631_ 


A_at 


0.5160313 


0.38311574 


U80226. 


_s_at 


U.O I OO I 'l-D 








0.514657 


0.38108054 


U38980. 


.at 


0.5133504 


0.37988254 


Y00796. 


.at 


0.5121836 


0.37843332 


J03040_ 


at 


0.5113878 


0.37749302 


X55740. 


.at 


0.51018846 


0.37679052 


X92475. 


.at 


0.5089304 


0.3758733 


U69263. 


_at 


0.5083398 


0.3742895 


U55258 


.at 



MOG Myelin oligodendrocyte 
glycoprotein 

ID2 Inhibitor of DNA binding 2, 
dominant negative helix-loop-helix 
protein 

ADORA1 Adenosine receptor A1 

CAPN2 Calpain, large polypeptide L2 

PRB2 locus salivary proline-rich protein 

mRNA, clone cP7 

TIMP2 Tissue inhibitor of 

metalloproteinase 2 

Clone 23720 mRNA sequence 

KIAA0027 gene, partial cds 

Low-Mr GTP-binding protein (RAB31) 
mRNA 

Metallothionein IV (MTIV) gene 

NUCLEAR PORE COMPLEX PROTEIN 
NUP214 

TNR Tenascin R (restrictin, janusin) 

Fructose 6-phosphate,2-kinase/fructose 
2,6-bisphosphatase 
BETA-HEXOSAMINIDASE ALPHA 
CHAIN PRECURSOR 
Growth/differentiation factor 1 (GDF-1) 
mRNA 

4-AMINOBUTYRATE 
AMINOTRANSFERASE, 
MITOCHONDRIAL PRECURSOR 
HMFG 

Ro/SSA ribonucleoprotein homolog 
(RoRet) mRNA 
Steroidogenic factor 1 mRNA 

POLA DNA polymerase alpha subunit 

KIAA0177 gene, partial cds 

Brain-expressed HHCPA78 homolog 
[human, HL-60 acute promyelocytic 
leukemia cells, mRNA, 2704 nt] 
TYK2 Protein-tyros ine kinase tyk2 (non- 
receptor) 

WAS Wiskott-Aldrich syndrome 
(ecezema-thrombocytopenia) 
TrkB {alternatively spliced} [human, 
brain, mRNA, 1870 nt] 
GNAI2 Guanine nucleotide binding 
protein (G protein), alpha inhibiting 
activity polypeptide 2 
BETA-2-MICROGLOBULiN 
PRECURSOR 
r CRYAB Crystallin alpha-B 

KIAA0006 gene 

Gamma-aminobutyric acid 
transaminase mRNA, partial cds 
Thyroid receptor interactor (TRIP9) 
gene 

PMS8 mRNA (yeast, mismatch repair 
gene PMS1 homologue), partial cds (C- 
terminal region) 

ITGAL Integrin, alpha L (antigen CD1 1A 
(p180), lymphocyte function-associated 
antigen 1; alpha polypeptide) 
SPARC SPARC/osteonectin 

NT5 5' nucleotidase (CD73) 

ITBA1 protein 

Matrilin-2 precursor mRNA, partial cds 

H BRAVO/Nr-CAM precursor 
(hBRAVO/Nr-CAM) gene 
class 1 0.86930543 0.62451684 0.5073476 0.37309632 S50017_s_at CNP 2',3'-cyclic nucleotide 3' phosphodiesterase
class 1 0.8662232 0.6241277 0.50710213 u.o/ io4yoo A/or u_at MT1L Metallothionein 1L
class 1 0.84933835 0.6234191 0.5067336 0.37121156 U28368_at ID4 Inhibitor of DNA binding 4, dominant negative helix-loop-helix protein
class 1 0.8469514 0.62305886 0.5062817 0.37057114 X74794_at CDC21 HOMOLOG
class 1 0.84659636 0.6230239 0.50575304 0.36982706 X75861_at TEGT Testis enhanced gene transcript
class 1 0.8460863 0.6192708 0.50347847 0.36910838 M83233_at TCF12 Transcription factor 12 (HTF4, helix-loop-helix transcription factors 4)
class 1 0.8459389 0.61905175 0.5014394 0.36789283 U73328_at DLX7 Distal-less homeobox 7
class 1 0.8424674 0.6189927 0.5009286 0.36755785 M95936_s_at AKT2 V-akt murine thymoma viral oncogene homolog 2
class 1 0.8386806 0.6188733 0.49939558 0.36682224 X85786_at BINDING REGULATORY FACTOR
class 1 0.8372769 0.61818075 0.49734905 0.36528903 U14394_at METALLOPROTEINASE INHIBITOR 3 PRECURSOR
class 1 0.8356367 0.6158615 0.49629536 0.36451015 D63486_at KIAA0152 gene
class 1 U.O»3ZZO*tr / [illegible] n 494^158 0.36395043 Z68280_cds2_s_at Erythrocyte adducin alpha subunit gene extracted from Human DNA sequence from cosmid L25A3, Huntington's Disease Region, chromosome 4p16.3 contains Human tetracycline transporter-like protein and erythrocyte adducin alpha subunit, multiple ESTs and a putative CpG island
class 1 0.83206296 0.6120459 0.4930411 0.36299255 U89335_cds2_at NOTCH4 gene (notch4) extracted from Human HLA class III region containing notch4 (NOTCH4) gene, complete sequence
class 1 0.8239408 0.61198014 0.49286792 0.36167002 U00928_at RNA-BINDING PROTEIN FUS/TLS
class 1 0.8234844 0.61135525 0.49238437 0.36087698 X00274_at HLA CLASS II HISTOCOMPATIBILITY ANTIGEN, DR ALPHA CHAIN PRECURSOR
class 1 0.82109606 0.6106854 0.49044895 0.36035484 M80244_at INTEGRAL MEMBRANE PROTEIN E16
class 1 0.81676286 0.61059463 0.48990437 0.3597854 D49410_at IL3RA Interleukin 3 receptor, alpha (low affinity)
class 1 0.8101961 [illegible] 0.48990065 0.35875112 L14813_at CELL Carboxyl ester lipase like protein
class 1 0.80911857 0.61003023 0.48817563 0.35794193 M77016_at TMOD Tropomodulin
class 1 0.8083867 0.60995644 0.48780048 0.35718402 Y08265_s_at DAN26 protein, partial
class 1 0.80540043 0.6097752 0.48703083 0.35631403 M69023_at Globin gene
class 1 0.8024884 0.6077933 0.4849392 0.35507387 M11749_at THY-1 MEMBRANE GLYCOPROTEIN PRECURSOR
class 1 0.8011927 0.60732687 0.48423845 0.35416168 X59892_at TRYPTOPHANYL-TRNA SYNTHETASE
class 1 0.7992353 0.6069737 0.48395756 0.3537728 HG987-HT987_at Mac25
class 1 0.79868704 0.6057914 0.48377454 0.35269728 X83863_at PTGER3 Prostaglandin E receptor 3 (subtype EP3) {alternative products}
class 1 ft 7Qft^7Q1 fi R0^7991 0 48^1^245 0.35175043 X55666_at Usf mRNA for late upstream transcription factor
class 1 0.7961344 0.604588 0.48308864 0.3515281 U30930_at CGT UDP-galactose ceramide galactosyl transferase
class 1 0.79439163 0.6007642 0.4818679 0.35095778 U19261_at Epstein-Barr virus-induced protein mRNA
class 1 0.79308397 0.59980625 0.48032397 0.35053596 L42601 J_at KERATIN, TYPE II CYTOSKELETAL 6D
class 1 0.79107213 0.5997507 0.48028195 0.35026234 U46023_at Xq28 mRNA
class 1 0.79009724 0.5991178 0.4783936 0.34935814 M10612_at APOC2 Apolipoprotein C-II
class 1 0.78904396 0.5987258 0.4782325 0.34850967 U79528_s_at Sigma receptor mRNA
class 1 0.788058 0.5983545 0.4774304 0.34845862 Z49825_s_at HEPATOCYTE NUCLEAR FACTOR 4
class 1 0.7867659 0.59789264 0.4762901 0.34749532 D84145_at WS-3 mRNA
class 1 0.78471535 0.59755576 0.47509775 0.34661785 Z11899_s_at POU5F1 Octamer binding protein 3
class 1 0.7831359 0.59647435 0.47479355 0.34539047 D63135_at ETS-like 30 kDa protein
class 1 0.78272295 0.5960181 0.47357324 0.3446057 U32680_at CLN3 Ceroid-lipofuscinosis, neuronal 3, juvenile (Batten, Spielmeyer-Vogt disease)
class 1 0.78065175 0.594793 0.4729/ Jo O.o4obb4bo aooU/ y_mai_ai GAA gene extracted from Human lysosomal alpha-glucosidase gene exon 1
class 1 0.7805558 0.5940042 0.47264746 0.34325802 U94747_at WD repeat protein HAN11 mRNA
class 2 1.5964093 0.9486641 0.8495439 0.62064433 J04164_at RPS3 Ribosomal protein S3
class 2 1.5496515 0.87290615 0.777105 0.5718508 M12125_at Skeletal beta-tropomyosin
class 2 1.5152686 0.827159 0.73881376 0.5467952 D17400_at PTS 6-pyruvoyltetrahydropterin synthase
class 2 1.4285764 0.8085277 0.71054107 0.5315226 D29958_at KIAA0116 gene, partial cds
class 2 1.406929 0.7890778 0.69446135 0.5199787 D84454_at UDP-galactose translocator
class 2 1.3972126 n 771fi^9 U.DOUOJi «J 0^063799 D83174_s_at CBP1 Collagen-binding protein 1
class 2 1.3882682 0.7628288 0.6654627 0.4983154 D83735_at Adult heart mRNA for neutral calponin
class 2 1.3158283 0.7600643 0.65747064 0.48707137 L38969_at Thrombospondin 3 (THBS3) gene
class 2 1.2211796 0.7oob1b/b [illegible] n 47G9GQ77 I M94fi^ at RPS11 Ribosomal protein S11
class 2 1.2204406 0.74606985 0.64249825 0.47334 U47621_at Nucleolar autoantigen No55 mRNA
class 2 1.2186558 0.744558 0.6345838 0.46700227 D80005_at KIAA0183 gene, partial cds
class 2 1.2145118 0.7413605 0.6255762 0.4oo51ol A/yooo_s_at LAMB2 Laminin, beta 2 (laminin S)
class 2 1.1926116 0.73349786 0.62487775 0.45888507 U73377_at SKI V-ski avian sarcoma viral oncogene homolog
class 2 1 . 1 00300*+ 0 7101*549 0.6203104 0.4543584 L21954_at PERIPHERAL-TYPE BENZODIAZEPINE RECEPTOR
class 2 1.1789806 0.70968467 0.61982125 0.45166668 D85418_at Phosphatidylinositol-glycan-class C (PIG-C)
class 2 1.1726408 0.7075023 0.6164542 0.4459662 U50523_at BRCA2 region, mRNA sequence CG037
class 2 1.1627295 0.6977533 0.60977054 0.44309354 U13991_at TATA-binding protein associated factor 30 kDa subunit (tafII30) mRNA
class 2 1.1177913 0.6966777 0.60499704 0.4403641 L41066_at NF-AT3 mRNA
class 2 1.1 lOTOO^ u.oyo^oy^to [illegible] 0 4T546635 S80343_at RARS Arginyl-tRNA synthetase
class 2 1.1063769 0.6934489 0.6001345 0.43275866 D78586_at CAD PROTEIN
class 2 1.100164 0.68982345 0.59710175 0.43011752 Ao4oU4_at Myosin regulatory light chain mRNA
class 2 1.0985785 0.6834978 0.59472984 0.4265189 X94910_at ERp31 protein
class 2 1.0931795 0.68134505 0.59273076 0.42424324 U31383_at G protein gamma-10 subunit mRNA
class 2 1 075^97 0.67965597 0.5872831 0.42292687 D30755_at VIM Vimentin
class 2 1 nfiART77 [illegible] 0.5853282 0.42069253 U70439_s_at PHAPI2b protein
class 2 [illegible] 0.6734994 0.5838599 0.41816902 M19645_at 78 KD GLUCOSE REGULATED PROTEIN PRECURSOR
class 2 1.0563896 0.67214715 0.58231425 0.41704497 D45248_at Proteasome activator hPA28 subunit beta
class 2 1.0528408 0.6703584 0.5772028 0.4129843 M14338_at PROS1 Plasma protein S
class 2 1.0516357 0.6697295 0.57507044 0.41167092 D31888_at KIAA0071 gene, partial cds
class 2 1.0499836 0.6691372 0.57322055 0.41046417 D79996_at KIAA0174 gene
class 2 1.0487297 0.66673857 0.5681988 0.40736142 U34683_at GSS Glutathione synthetase
class 2 1.0459839 u.bbobobyb U.Ob/ Z14o/ U.*fUOUOO%>f l_l<tOoO eU RSU-1/RSP-1 mRNA
class 2 1.0183622 0.65781057 0.5650841 0.40312052 X61587_at ARHG Ras homolog gene family, member G (rho G)
class 2 1.0072109 U.boooJzo U.bbsJo2/ib n AM, RA1AA y^777 at 60S RIBOSOMAL PROTEIN L23
class 2 0.99361455 0.6530042 0.56308913 0.39934984 X06700_s_at COL3A1 Alpha-1 type 3 collagen
class 2 0.98622197 0.6510077 0.56030434 0.3967915 U41515_at Deleted in split hand/split foot 1 (DSS1) mRNA
class 2 0.9862092 0.6501271 0.5597317 0.3955185 L14565_at PERIPHERIN
class 2 0.9861108 0.65010846 0.5552204 0.3940714 M63573_at PPIB Peptidylprolyl isomerase B (cyclophilin B)
class 2 0.9772656 0.64656097 0.55292964 0.39343733 Z23090_at HSPB1 Heat shock 27kD protein 1
class 2 0.9681267 0.64562047 0.5525739 0.39118996 L25085_at PROTEIN TRANSPORT PROTEIN SEC61 BETA SUBUNIT
class 2 0.963596 0.644272 0.551029 0.38998803 U72514_at C2f mRNA
class 2 0.9609764 0.6431171 0.5493531 0.3888596 X15187_at TRA1 Homologue of mouse tumor rejection antigen gp96
class 2 0.95701075 0.6428692 0.5473165 0.3870365 M29971_at MGMT 6-O-methylguanine-DNA methyltransferase (MGMT)
class 2 0.95551085 0.6414765 0.54465944 0.3854772 D79997_at KIAA0175 gene
class 2 u.yo4ooo i n R41 1 ^07^ 0.54276997 0.38295317 Y07604_at Nucleoside-diphosphate kinase
class 2 0.95175433 0.63506126 0.54169023 0.38189143 D78611_at MEST Mesoderm specific transcript (mouse) homolog
class 2 0.9498536 0.6342157 0.54131114 0.380591 U84720_at mRNA export protein Rae1 (RAE1) mRNA
class 2 0.9433024 0.63139164 0.54059446 0.37891367 U72263_s_at EXT2 Exostoses (multiple) 2
class 2 0.94292474 0.63092226 0.5366494 0.37772393 X85373_at Sm protein G
class 2 0.94150615 0.6304973 0.5353541 0.37663063 X98296_at Ubiquitin hydrolase
class 2 0.9404326 0.6302717 0.534726 0.37519532 U28811_at Cysteine-rich fibroblast growth factor receptor (CFR-1) mRNA
class 2 0.93543696 0.62990934 0.5340427 0.37421787 U41387_at Gu protein mRNA, partial cds
class 2 0.9310851 0.62937075 0.5334321 0.37231687 L38951_at Importin beta subunit mRNA
class 2 0.9302329 U.b2oo 1 /DO U.0o22O004 A 071 ARQ7Q M 1 1 71 ft at COL5A2 Collagen, type V, alpha
class 2 0.9218427 0.6272878 0.5308931 0.37046224 X02152_at LDHA Lactate dehydrogenase A
class 2 0.91466784 0.6263975 0.5304401 0.36923638 X13839_at LCAT Lecithin-cholesterol acyltransferase
class 2 0.91358876 0.6234602 0.53026843 0.3675315 Z25749_rna1_at Ribosomal protein S7
class 2 0.9135651 0.62286264 0.52784836 0.36692485 D00763_at GAPD Glyceraldehyde-3-phosphate dehydrogenase
class 2 0.91283256 0.6222702 0.52783066 0.36592916 L25270_at XE169 PROTEIN
class 2 0.9052988 0.6209669 0.5276215 0.3650857 M64098_at High density lipoprotein binding protein (HBP) mRNA
class 2 0.90166676 0.619184 0.52735156 0.36361563 D42041_at rxiAAUUtso gene, partial cds
class 2 0.8969863 0.6182318 0.5259541 0.362975 D14043_at PUTATIVE MUCIN CORE PROTEIN PRECURSOR 24
class 2 0.89683557 O.blbool U.OzOUo [illegible] nfl9Q4ft at 5-aminoimidazole-4-carboxamide-1-beta-D-ribonucleotide transformylase/inosinicase
class 2 0.8967383 0.6154521 0.5231539 0.3603715 U09587_at GARS Glycyl-tRNA synthetase
class 2 0.89605194 0.61400/0 U. 022/0/ 4 [illegible] H7ft97^ at Proteasome subunit p42
class 2 0.8807301 0.61248505 0.52253556 0.35895807 U15655_at Ets domain protein ERF mRNA
class 2 0.8791751 0.6105275 0.52134347 0.3578664 M33308_at VCL Vinculin
class 2 0.87861866 0.6101475 0.5194662 0.3572384 J04456_at LGALS1 Ubiquinol-cytochrome c reductase core protein II
class 2 0.8758852 0.6092619 Q.Olo2y94b U. 0004/ 02 IVIZ4U0y at HMA-RINniNfi PROTEIN A
class 2 0.8756332 0.60834265 0.5160915 0.3562142 X66945_at FGFR1 Basic fibroblast growth factor (bFGF) receptor (shorter form)
class 2 0.8730137 0.6069743 0.51504296 0.35525343 M22382_at HSPD1 Heat shock 60 kD protein 1 (chaperonin)
class 2 0.871905 0.6060964 0.513713 0.3543604 J03191_at Profilin mRNA
class 2 0.8703043 0.603569 0.51317245 0.35314965 U47926_at Unknown protein B mRNA
class 2 0.86565924 0.6020561 0.5128095 0.35230598 M85289_at HSPG2 Heparan sulfate proteoglycan
class 2 0.864009 0.6016002 0.51181066 0.3514165 M14636_at PYGL Glycogen phosphorylase L (liver form)
class 2 0.854852 0.6013029 0.5113688 0.35065523 S78187_at M-PHASE INDUCER PHOSPHATASE 2
class 2 0.85385114 0.5993562 0.5086435 0.34998524 S71018_at Cyclophilin C [human, kidney, mRNA, ooo nij
class 2 0.8520942 0.5992366 0.5079732 0.34953746 M14949_at RAS-RELATED PROTEIN R-RAS
class 2 u.oty i 3j i [illegible] [illegible] 0.3484389 X99920_at S100 calcium-binding protein A13
class 2 0.8471685 0.5982197 0.5060602 0.34780023 J03824_at UROS Uroporphyrinogen III synthase
class 2 0.8435629 0.598039 0.50526935 0.3470213 L35240_at Enigma gene
class 2 0 841B742 0.59718853 0.5044416 0.3458508 X62691_at 40S RIBOSOMAL PROTEIN S15A
class 2 0.84077215 0.59648985 0.5040173 0.34466365 L07758_at IEF SSP 9502 mRNA
class 2 0.8404173 0.59592414 0.50253683 0.3441604 U35139_at NECDIN related protein mRNA
class 2 0.8322671 0.59347093 0.5013147 0.34315905 U91985_at DNA fragmentation factor-45 mRNA
class 2 0.83103186 0.59304476 0.49816543 0.34257925 D14533_at XPA Xeroderma pigmentosum, complementation group A
class 2 0.83078337 0.59284073 0.49728844 0.34181342 D37965_at PDGF receptor beta-like tumor suppressor (PRLTS)
class 2 0.8286705 0.5926845 0.49725458 0.341386 U72621_at LOT1 mRNA
class 2 0.82835925 0.59190524 0.4970425 0.3410285 HG1153-HT1153_at Nucleoside Diphosphate Kinase Nm23-H2s
class 2 0.8277117 0.59029454 0.49623272 0.3406979 M16938_s_at Homeo box c8 protein, mRNA
class 2 0.8266656 0.5901575 0.49541354 0.34014bUb UloUO/_ai DEFENDER AGAINST CELL DEATH 1
class 2 0.8260545 0.58884704 0.49504972 0.3398025 L40393_at (clone S171) mRNA
class 2 0.8255064 0.58706975 0.49395016 0.3389208 U40572_at Beta2-syntrophin (SNT B2) mRNA
class 2 0.82378906 0.58681875 0.4934995 0.33787978 U07550_at HSPE1 Heat shock 10 kD protein 1 (chaperonin 10)
class 2 0.8221365 0.5852637 0.49296644 0.33739457 U94855_at Translation initiation factor 3 47 kDa subunit mRNA
class 2 0.8211939 0.58489823 0.49241668 0.33726045 L13278_at CRYZ Crystallin zeta (quinone reductase)
class 2 0.8211335 0.5832674 0.4924099 0.33619076 D00591_at CHC1 Chromosome condensation 1
class 2 0.8189326 0.58308315 0.49072617 0.33576697 X70991_at MADER mRNA
class 2 0.81821126 0.5828148 0.49018598 0.3352356 X97074_at EEF2 Eukaryotic translation elongation factor 2
class 2 0.8171171 0.5811897 0.48924693 0.33457905 U68105_s_at PABPL1 Poly(A)-binding protein-like 1
class 3 4.298069 1.8113496 1.542546 0.99697 D87463_at KIAA0273 gene
class 3 3.7472157 1.5923314 1.3552583 0.88737005 U90902_at Clone 23612 mRNA sequence
class 3 3.690101 1.5091044 1.2780975 0.8364649 D26070_at Type 1 inositol 1,4,5-trisphosphate
class 3 3.6179547 1.4561309 1.2265519 0.79228497 X63578_rna1_at Parvalbumin
class 3 3.5801797 1.3800497 1.1935662 0.76851 Z15108_at PRKCZ Protein kinase C, zeta
class 3 3.1552649 1.3531709 1.1540082 0.7488579 L35592_at Germline mRNA sequence
class 3 9 Qft?7Q 1.3409475 1.1488526 0.7350318 L10338_s_at SCN1B Sodium channel, voltage-gated, type I, beta polypeptide
class 3 2.9386811 1.3199779 1.1085008 0.7183979 L33243_at PKD1 Polycystic kidney disease protein 1
class 3 2.8076477 1.3121926 1.0983416 0.70677555 L77864_at Stat-like protein (Fe65) mRNA
class 3 2.7285392 1.3062973 1.0917186 0.6917782 J04469_at Mitochondrial creatine kinase (CKMT) gene
class 3 o SQ8Q084 1.2814465 1.0786043 0.6817148 U92457_s_at Metabotropic glutamate receptor 4 mRNA
class 3 9 5474955 1 9378745 1.0633833 0.67317224 D21267_at SYNAPTOSOMAL ASSOCIATED PROTEIN 25
class 3 2.4580472 1.2305647 1.0299538 0.66360885 U79288_at Clone 23682 mRNA sequence
class 3 9 ^450984 1 9149801 1.0283259 0.65945446 D63479_s_at DAGK4 Diacylglycerol kinase delta
class 3 2.342861 1.2094874 1.0197432 0.65128297 L07807_s_at DNM1 Dynamin 1
class 3 2.280001 1.2006655 1.0105054 0.6466257 D31883_at KIAA0059 gene
class 3 2.2601855 1.1856893 1.0029721 0.6399636 L13266_s_at GRIN1 Glutamate receptor, ionotropic, N-methyl D-aspartate 1
class 3 2.2338665 1.1824055 0.99219334 0.63467205 U33632_at Two P-domain K+ channel TWIK-1 mRNA
class 3 2.187364 1.1767278 0.9869156 0.6280837 X06956_at TUBULIN ALPHA-4 CHAIN
class 3 9 17Q1Q85 1.1758486 0.9814895 0.62358665 U52827_at Cri-du-chat region mRNA, clone NIBB11
class 3 9 148Q515 1.1723996 0.96380335 0.6196844 U16296_at TIAM1 T-cell lymphoma invasion and metastasis 1
class 3 9 1310797 1.1686139 0.9591985 0.61347145 U79289_at Clone 23695 mRNA sequence
class 3 2.1222842 1.1660343 0.9513746 0.60739744 L47738_at Inducible protein mRNA
class 3 2.0701365 1.1609877 0.94285935 0.60494643 U39412_at Platelet alpha SNAP mRNA
class 3 2.0604408 1.1502872 0.93769294 0.59993637 M13577_at MBP Myelin basic protein
class 3 2.0528634 1.147987 0.934068 0.5970684 M65066_at PRKAR1B Protein kinase, cAMP-dependent, regulatory, type I, beta
class 3 2.0507598 1.1448451 0.9319095 0.59396625 X51956_rna1_at ENO2 gene for neuron specific (gamma) enolase
class 3 2.0365791 1.1420877 0.9278926 0.5918934 X80818_at GRM4 Glutamate receptor, metabotropic 4
class 3 2.0284941 1.1411111 0.9187019 0.5886168 U67963_at Lysophospholipase homolog (HU-K5) mRNA
class 3 2.02188 1.1368862 0.91659665 0.5841766 D87074_at KIAA0237 gene
class 3 2.0109 1.130213 0.9079699 0.581688 D87465_at KIAA0275 gene
class 3 2.004002 1.1282408 0.9066855 0.5797237 S72493_s_at KERATIN, TYPE I CYTOSKELETAL 17
class 3 1.959235 1.1278926 0.8963888 0.57711166 D63851_at Unc-18 homologue
class 3 1.9334141 1.1088517 0.88973886 0.57648414 U90907_at Clone 23907 mRNA sequence
class 3 1 Q977fi/1R i . i uou i jo U.OOOJUZJ n ^740517 U13616_at ANK3 Ankyrin G
class 3 1.9067957 1.1016182 0.8826693 0.5693379 U79245_at Clone 23586 mRNA sequence
class 3 1.8848716 1.1010174 0.8821242 0.565312 X64838_at RSN Restin (Reed-Steinberg cell-expressed intermediate filament-associated protein)
class 3 1.8844064 1.099194 0.8803771 0.5631035 D83542_at Cadherin-15
class 3 1 A7RR7R 1 nQRiR9R f) R71^QR1 0 5R04119 U81607_at GRAVIN
class 3 1.8755924 1.0932494 0.8671489 0.55765635 M64925_at MPP1 Membrane protein, palmitoylated 1 (55kD)
class 3 i .ooooyoo [illegible] n RRRRR4 0.5553793 D78577_s_at YWHAH Tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, eta polypeptide
class 3 1 R5R2501 1.0852915 0.8653905 0.55245095 U47928_at Protein A alternatively spliced form 2 (A-2) mRNA
class 3 1.8516157 1.0834761 0.86097085 0.5506602 M96859_at DPP6 Dipeptidylpeptidase VI
class 3 1.8367375 1.0826322 0.85773826 0.5468505 U76421_at DsRNA adenosine deaminase DRADA2b (DRADA2b) mRNA
class 3 1.8268647 1.0788103 0.85618746 0.5450503 HG2259-HT2348_s_at Tubulin, Alpha 1, Isoform 44
class 3 1.8232589 1.0724745 0.8504435 0.54230773 X14766_at GABRA1 Gamma-aminobutyric acid (GABA) A receptor, alpha 1
class 3 1.8144729 1.0702177 0.8499052 0.5412158 U07139_at CAB3b mRNA for calcium channel beta3 subunit
class 3 1.8111364 1.0686209 0.8456019 U.OO/DD^OO L/oo^f_ai Metabotropic glutamate receptor 1 alpha (mGluR1alpha) mRNA
class 3 1.7889462 1.0664026 0.8448219 0.53580296 M37400_at GOT1 Glutamic-oxaloacetic transaminase 1, soluble (aspartate aminotransferase 1)
class 3 1.7856433 1.0653843 0.84243757 0.5344206 U27193_at Protein-tyrosine phosphatase mRNA
class 3 1 77QRR4R 1 .UDOJJO 0 R40Q1^Q5 0 ^IPORRT D63477_at KIAA0143 gene, partial cds
class 3 [illegible] [illegible] 0 R^Rft4^0S 0.53067225 X92493_s_at o i ivw protein
class 3 1 7^R*^9R 1 OR 1^9? [illegible] 0.5297999 X70940_s_at EEF1A2 Eukaryotic translation elongation factor 1 alpha 2
class 3 1.7429492 1.0588214 0.83477515 0.52727365 D29013_at POLB DNA polymerase beta subunit
class 3 1.7351419 1.0561141 0.8308162 0.52565986 D79998_at KIAA0176 gene, partial cds
class 3 1.733765 1.0514225 0.83047813 0.5235324 U25029_at GRL Glucocorticoid receptor alpha {alternative products}
class 3 1.7079235 1.0505984 0.8300135 0.52023715 J04046_s_at CALMODULIN
class 3 1.7047126 1.0464945 ^ Q0QQ07Q 0.5191814 M33653_at COL4A2 Collagen, type IV, alpha 2
class 3 1.6961877 1.0404369 0.82708424 0.5182239 M58583_at CEREBELLIN 1 PRECURSOR
class 3 1.6901898 1.034222 0.82666314 0.51679677 M32313_at SRD5A1 Steroid-5-alpha-reductase, alpha polypeptide 1 (3-oxo-5 alpha-steroid delta 4-dehydrogenase alpha 1)
class 3 1.6850768 1.0337527 0.82482046 0.51418847 D82347_at NEUROD1 Neurogenic differentiation 1
class 3 1.6842928 1.0330929 0.8175216 0.5133849 D83777_at KIAA0193 gene
class 3 1.6828924 1.0288644 0.8172416 0.5115587 Z31695_at 43 kDa inositol polyphosphate 5-phosphatase
class 3 1.6768361 1.0284845 0.81461614 0.51045823 X90824_s_at USF2a & USF2b, clone P2
class 3 1.6749456 1.0247817 0.8121015 0.50901747 D43636_at KIAA0096 gene, partial cds
class 3 1 R7/1R1R [illegible] n RnQR9R^4 0.5089023 L10373_at MXS1 Membrane component, X chromosome, surface marker 1
class 3 1.6696235 1.0166084 0.8096402 0.5065178 X56411_rna1_at ADH4 gene for class II alcohol dehydrogenase (pi subunit), exon 1
class 3 1.6669358 1.0154912 0.80860317 0.5050052 U67171_at Selenoprotein W (selW) mRNA
class 3 1.6513395 1.0122768 0.80841583 0.5046859 Y09392_s_at WSL-LR, WSL-S1 and WSL-S2 proteins
class 3 1.6431793 1.0100572 0.8075384 0.50179076 U85707_at Leukemogenic homolog protein (MEIS1) mRNA
class 3 1.6431123 1.0041083 0.8070719 0.50103754 Y00067_rna1_at Neurofilament subunit M (NF-M)
class 3 1.6419501 1.0011116 0.8067552 0.49965492 U87223_at Contactin associated protein (Caspr) mRNA
class 3 1.6394771 0.9955323 0.80538327 0.49950275 L10333_s_at Neuroendocrine-specific protein A (NSP) mRNA
class 3 1.6353047 0.98837924 0.7981355 0.49779063 U17838_at Zinc finger protein RIZ mRNA
class 3 1.6318842 0.9861097 0.7980577 0.49699193 U46901_at SNCA Synuclein, alpha (non A4 component of amyloid precursor)
class 3 1.6295244 0.97382015 0.79276174 0.4967051 X79888_at AUH mRNA
class 3 1.6287894 0.97215784 0.79182774 0.49586895 U07620_at JNK3 alpha2 protein kinase (JNK3A2) mRNA
class 3 1.6147838 0.9710149 0.7907015 0.49469632 U24152_at P21-activated protein kinase (Pak1) gene
class 3 1.6119615 0.966481 0.78996116 0.49367088 S72043_rna1_at GIF=growth inhibitory factor [human, brain, Genomic, 2015 nt]
class 3 1.6035154 0.9626293 0.78644234 0.49325758 D87464_at KIAA0274 gene
class 3 1.5985726 0.9546146 0.7859533 0.4915498 U32439_at Regulator of G-protein signaling similarity (RGS7) mRNA, partial cds
class 3 1.5867528 0.94753075 0.7855334 0.49130288 L02950_at CRYM Crystallin Mu
class 3 1.5784372 0.94672173 0.78228825 0.48999268 M88279_at FKBP4 FK506-binding protein 4 (59kD)
class 3 1.5764623 0.94641775 f\ 77QD1A07 U.4cjy2y4/4 A/D040_ _ai GLRX Glutaredoxin (thioltransferase)
class 3 1.5748655 0.94637823 0.77925485 0.48841372 X05196_at Aldolase C gene
class 3 1.5744048 0.9456003 0.7786697 0.48784614 D83407_at ZAKI-4 mRNA in human skin fibroblast
class 3 1.5715562 0.94528913 0.7786567 0.48684633 D38024_at Facioscapulohumeral muscular dystrophy (FSHD) gene region, D4Z4 tandem repeat unit
class 3 1.570073 0.94405437 0.776268 0.4850091 M98539_at Prostaglandin D2 synthase gene
class 3 1.5688661 0.9425145 0.775566 0.48369113 M99063_at KERATIN, TYPE II CYTOSKELETAL 2 ORAL
class 3 1.5683391 0.939959 0.77452576 0.4827068 U06681_at Clone CCA12 mRNA containing CCA trinucleotide repeat
class 3 1.5661842 0.93824065 0.7729135 0.4820274 M22976_at CYB5 Cytochrome b-5
class 3 1.5626711 0.9377882 0.7699611 0.4809086 S77410_at AGTR1 Angiotensin receptor 1
class 3 I .OOttO I o [illegible] 0.7696227 0.4793128 M29551_at SERINE/THREONINE PROTEIN PHOSPHATASE 2B CATALYTIC SUBUNIT, BETA ISOFORM
class 3 1.5440252 0.93378896 0.76802355 0.47835198 U51477_at Diacylglycerol kinase zeta mRNA
class 3 1.5358244 0.93254817 0.7675914 0.4/ /bob^ iviyzoUJ_at DIHYDROPYRIDINE-SENSITIVE L-TYPE, CALCIUM CHANNEL BETA-1-B1 SUBUNIT
class 3 1.5323644 0.9281277 0.7668785 0.4765654 U79251_at OPCML Opioid-binding cell adhesion molecule
class 3 1.5293515 0.9264165 0.766238 0.47640666 Uob97U_at MAAU^ib gene
class 3 1.5226866 0.9237222 0.7658018 0.47434324 L39833_at K+ channel beta 1a subunit mRNA, alternatively spliced
class 3 1.5180973 0.9220966 0.7589294 0.47297105 U18937_at Histidyl-tRNA synthetase homolog (H03) mRNA
class 3 1.516081 0.9166196 0.75824547 0.47195733 U45975_at Phosphatidylinositol (4,5)bisphosphate 5-phosphatase homolog mRNA, partial cds


class 4  0.8734975  1.134452  0.9411052  0.67896336  M80397_s_at  POLD1 Polymerase (DNA directed),
class 4  [illegible]  [illegible]  [illegible]  0.6297001  X14830_at  CHRNB1 Cholinergic receptor, nicotinic, beta polypeptide 1 (muscle)
class 4  0.81055194  0.9483548  0.8026216  0.6025905  K02882_cds1_s_at  IGHD gene (immunoglobulin delta-chain) extracted from Human germline IgD chain gene, C-region, C-delta-1 domain
class 4  0.79492754  0.9124361  0.7859309  0.5878106  HG4178-HT4448_at  AM 7
class 4  0.7530036  0.8959806  0.7669699  [illegible]  U97018_at  Echinoderm microtubule-associated protein homolog HuEMAP mRNA
class 4  0.7159336  0.88823646  0.7476684  [illegible]  X52228_at  MUC1 Mucin 1, transmembrane
class 4  0.7055104  0.87415504  0.73530275  0.54636556  L18920_f_at  MELANOMA-ASSOCIATED ANTIGEN 2
class 4  0.6556214  0.87129426  0.72525615  0.53645986  D29675_at  Inducible nitric oxide synthase gene, promoter and exon 1
class 4  0.6340733  0.8697119  0.71343136  0.5269011  S82471_s_at  SSX3=Kruppel-associated box containing SSX gene [human, testis, mRNA Partial, 675 nt]
class 4  0.6339644  0.84462005  0.707733  0.52023065  U22314_s_at  Neural-restrictive silencer factor, splice variant mRNA, partial cds
class 4  0.6246593  0.83731145  0.7006627  0.51330495  X74987_s_at  2-5A binding protein
class 4  0.61529684  0.82780534  0.6935719  0.50730973  M17466_at
class 4  [illegible]  [illegible]
class 4  0.5956711  0.82629377  0.68068653  0.4996151  M54951_at
class 4  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
class 4  0.58323526  0.8113131  0.6684805  0.49252754  M36429_s_at
class 4  0.58193743  0.8040995  0.6674828  0.4896936  M95623_cds1_at
class 4  0.579044  0.8023006  0.6626152  0.48392293  U57592_at
class 4  0.5783975  0.7968204  0.6584121  0.4814893  U40223_at
class 4  0.5780785  0.79416245  0.65169525  0.47840923  K02777_s_at
class 4  0.5670055  0.7924286  0.64660954  0.47577995  U79302_at
class 4  0.5656941  0.7891743  0.6452447  0.47280827  U40462_at
class 4  0.56489813  0.78810704  [illegible]  [illegible]
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K02405J_at 


class 4  [illegible]
class 4  0.5433172  0.767859  0.6289676  0.45595473  M77140_at
class 4
0.62277097  0.44980803
0.75743777
0.53244615  0.75203496  0.61582357
class 4  0.53053224  0.74949163  0.6133881  0.44177684  M81829_at
class 4  0.5266541  0.73750824  0.61149806  0.4408685  M21389_at
class 4  0.5265538  0.7358704  0.60937744  0.43847492  HG2147-HT2217_r_at
class 4  0.52553517  0.73483276  0.6050042  0.4369384  M64572_at
class 4  0.52480435  0.7332647  0.6033811  0.43495926  D88155_s_at
class 4  0.52309185  0.7292017  0.6027077  0.43228412  U88964_at
class 4  0.5213507  0.7290054  0.60209686  0.4313736  Z84721_cds1_at
class 4  0.5202241  0.72878873  0.5992238  0.42955112  X91103_at
class 4  0.51837975  0.72774935  0.5977254  0.4274839  U49082_at
0.5167581  0.7269519  0.5975571  0.425647  U05255_at
0.511345  0.7261254  0.59631515  0.42395902  D42039_at
class 4  0.51000273  0.72500247  0.59339905  0.42339996  U45448_s_at
class 4  0.5078437  0.72347194  0.5933299  0.42207658
class 4  0.5066535  0.7224339  0.5908667  0.4204116
class 4
0.72010094  0.5878204  0.41856045  HG2255-HT2344_f_at
class 4  0.49719426  0.7200707
0.49097803  0.7180478  0.5869554  0.41703936
F12 Coagulation factor XII (Hageman factor)
GYPE Glycophorin E
ATRIAL NATRIURETIC FACTOR PRECURSOR
C9 Complement component C9
Transducin beta-2 subunit mRNA
PBGD gene (hydroxymethylbilane synthase) extracted from Homo sapiens hydroxymethylbilane synthase gene
Jumonji putative protein (jumonji) mRNA
Uridine nucleotide receptor (UNR) gene
T-cell receptor active alpha-chain mRNA from Jurkat cell line
Clone 23855 mRNA, partial cds
Ikaros/LyF-1 homolog (hIk-1) mRNA
PBX2 mRNA
HLA CLASS II HISTOCOMPATIBILITY ANTIGEN, DQ(1) BETA CHAIN PRECURSOR
KIAA0040 gene
CBL Cas-Br-M (murine) ecotropic retroviral transforming sequence
GALN Galanin
Bullous pemphigoid autoantigen BP180 gene, 3' end
Protein-tyrosine-phosphatase (tissue type: testis)
Transmembrane protein Jagged 1 (HJ1) mRNA
APOH Apolipoprotein H
HETEROGENEOUS NUCLEAR RIBONUCLEOPROTEIN L
MACH-beta-4 protein
Somatostatin receptor isoform 1 gene
KRT5 Keratin 5 (epidermolysis bullosa simplex, Dowling-Meara/Kobner/Weber-Cockayne types)
Mucin 3, Intestinal (Gb:M55405)
PTPN3 Protein tyrosine phosphatase, non-receptor type 3
Steroidogenic factor 1 mRNA
HEM45 mRNA
Zeta-globin 1 gene extracted from Human DNA sequence from cosmid GG1 from a contig from the tip of the short arm of chromosome 16, spanning 2Mb of 16p13.3 Contains alpha and zeta globin genes and ESTs
Hr44 protein
Transporter protein (g17) mRNA
GLYCOPHORIN B PRECURSOR
KIAA0081 gene, partial cds
P2x1 receptor mRNA
Cell Division Cycle Protein 2-Related Protein Kinase (Pisslre)
CYSTATIN A
Phosphoribosyl Pyrophosphate Synthetase, Subunit Iii
CASR Calcium-sensing receptor (hypocalciuric hypercalcemia 1, severe neonatal hyperparathyroidism)
Homeobox protein Cdx2 mRNA
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0.7152811 
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0.41594544 Y00318_at 


IF I factor (complement)
0.48947743  0.714268  0.5866425  0.4146865  M76180_at  DDC Dopa decarboxylase (aromatic L-amino acid decarboxylase)
class 4  0.4875335  0.71361583  0.5847286  0.4133012  M29696_at  IL7R Interleukin 7 receptor
0.48321855  0.71123636  0.58444405  0.41136318  L13197_at  PAPPA Pregnancy-associated plasma protein A
class 4  0.48314202  0.709341  0.583089  0.40934148  M83664_at  HLA-DPB1 Major histocompatibility complex, class II, DP beta 1
class 4  0.48295903  0.70909935  0.5790559  0.40833262  X93996_rna1_at  AFX protein
class 4  0.48157865  0.7063981  0.5782582  0.4079157  HG537-HT537_at  Collagen, Type Viii, Alpha 2
0.48118794  0.7061022  0.57807314  0.40636182  M92432_at
0.70342964  0.57601345  0.4028365  HG2149-HT2219_at  Mucin [illegible]
0.48017803  0.7025353  0.5742331  0.40200925  V00532_rna1_f_at  IFNA gene (interferon alpha-i) extracted from human gene for leukocyte (alpha) interferon C
class 4  0.47445068  0.70189375  0.57402205  0.40101275  X03363_s_at  ERBB2 V-erb-b2 avian erythroblastic leukemia viral oncogene homolog 2 (neuro/glioblastoma derived oncogene homolog)
class 4  0.47439837  0.70042795  0.5727844  0.40050083  M16276_at  HLA-DQB1 Major histocompatibility complex, class II, DQ beta 1


class 4  0.4743872  0.69863063  0.57177055  0.3994889  M32598_at  RPS11 Ribosomal protein S11
class 4  [illegible]  [illegible]  0.57023054  0.39857152  U50383_at  Retinoic acid-responsive protein (NN8-4AG) mRNA
class 4  0.4714457  0.69539475  0.56962925  0.39810398  U62966_at  Na+/nucleoside cotransporter (hCNT1c) mRNA
class 4  0.47041702  0.6917778  0.56895816  0.3967692  X69950_s_at  WT1 Wilms tumor 1
[missing]  [illegible]  0.69176567  0.5689231  0.39595485  M64269_s_at  Mast cell chymase gene
class 4  [illegible]  0.69092834  0.56770706  0.39445558  L07868_at  ERBB4 V-erb-a avian erythroblastic leukemia viral oncogene homolog-like 4
class 4  0.4666462  0.69058037  0.56759816  0.3939125  U66661_at  GABA-A receptor epsilon subunit mRNA
class 4  0.4662564  0.68824685  0.5647112  0.39127818  HG4677-HT5102_s_at  Oncogene Ret/Ptc2, Fusion Activated
class 4  [missing]  0.6872949  0.5642959  0.39101368  J04599_at  BGN Biglycan
class 4  0.45942208  0.6856201  [illegible]  0.39028785  HG4236-HT4506_f_at  Zinc Finger Protein Znf138
class 4  0.4591495  0.685429  0.5624474  0.38855514  HG3264-HT3441_at  Af-6 (Gb:U02478)
class 4  0.45771  0.683483  0.56229144  0.3877895  D86965_at  KIAA0210 gene
class 4  [illegible]  [missing]  [illegible]  0.387533  M33987_at  CA1 Carbonic anhydrase I
class 4  [illegible]  [illegible]  [illegible]  0.3865487  L43576_at  (clone EST02946) mRNA
class 4  [illegible]  [illegible]  [illegible]  0.38604084  A28102_at  GABAa receptor alpha-3 subunit
class 4  0.45359454  0.6804817  0.5579805  0.38481084  M11726_at  PPY Pancreatic polypeptide
class 4  0.45104054  0.6794522  0.55785537  0.38300934  X59770_at  INTERLEUKIN-1 RECEPTOR, TYPE II PRECURSOR
class 4  0.45097542  0.6782299  0.5546474  0.382007  L11372_at  Protocadherin 43 mRNA, 3' end of cds for alternative splicing PC43-12
class 4  0.45055133  0.6776299  0.5546461  0.38108552  X07496_at  APOA1 Apolipoprotein A-I
class 4  0.45006305  0.67592806  0.55318505  0.38004473  U51241_at  CMKBR3 Chemokine (C-C) receptor 3
class 4  0.4486073  0.6751799  0.5531416  0.37975523  L29306_s_at  Tryptophan hydroxylase (Tph) mRNA
class 4  0.4483117  0.6738043  0.5521731  0.37970376  L07738_at  DIHYDROPYRIDINE-SENSITIVE L-TYPE, SKELETAL MUSCLE CALCIUM CHANNEL GAMMA SUBUNIT
class 4  0.44712317  [illegible]  [illegible]  0.3794417  X99140_at  Hair keratin, hHb5
class 4  0.44642755  0.6727433  0.5485155  0.3787492  HG3502-HT3696_at  Homeotic Protein Hox5.4
class 4  [illegible]  0.6722694  0.5476115  0.37755352  U56244_at  HIG-1 mRNA
class 4  0.44423437  0.67023844  0.5466597  0.3766933  X82200_at  Staf50 mRNA
class 4  0.44366303  0.668862  0.5444793  0.37616846  U10323_at  Nuclear factor NF45 mRNA
class 4  0.44355562  0.6666349  0.5439289  0.37465096  X00437_s_at  TCRB T-cell receptor, beta cluster
class 4  0.44308597  0.66651595  0.5437791  0.3744109  L38707_at  Diacylglycerol kinase (DAGK) mRNA
class 4  0.44278786  0.6659957  0.54239744  0.3734854  Z29481_at  3-HYDROXYANTHRANILATE 3,4-DIOXYGENASE
class 4  0.4408863  0.66582125  0.5416646  0.37182057  S78873_s_at  Guanine nucleotide exchange factor mss4 mRNA
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class 4 








0.37104592  X75342_at  SHB SHB adaptor protein (a Src homology 2 protein)
class 4  0.43772706  0.66497093  0.53880125  0.3705083  U62433_at  CHRNA4 Cholinergic receptor, nicotinic, alpha polypeptide 4
class 4  0.43764022  0.66314214  0.5384878  0.36978632  AB002559_at  Hunc18b2
class 4  0.4371058  0.66192573  0.5379855  0.3691347  X91809_at  GAIP protein
class 4  0.43608817  0.66024655  0.537967  0.3691051  Y09980_rna4_at  HOXD3 gene
class 4  0.43561712  0.6573707  0.537837  0.36905307  HG2028-HT2082_at  Laminin, A Polypeptide
class 4  0.43479303  0.6573559  0.5359841  0.3681883  U81262_at  EPLG5 Eph-related receptor tyrosine kinase ligand 5
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Multiple tumor clustering 

The results of clustering the multi-tumor Dataset A are shown below. Two
clustering methods were used, as described in the Clustering section. Except
for the PNETs, the samples cluster mostly along tissue types. The AT/RT
samples cluster together despite coming from different locations (renal,
extrarenal and CNS).

Hierarchical Clustering of Multiple Tumor Samples 
Michael Eisen's clustering algorithm 
Dataset A 

Values thresholded to 100 from below and 16000 from above 
Variation filter: max/min > 12 (12-fold), max-min= 1200 absolute units 
Number of features (genes) = 1065 
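
The preprocessing listed above (thresholding followed by a variation filter) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the two filter conditions are assumed to be combined with a logical AND, and "max-min= 1200" is read as a strict lower bound on the absolute range. It is not the code used to produce the figures.

```python
def threshold(values, floor=100.0, ceiling=16000.0):
    """Clip every expression value into the interval [floor, ceiling]."""
    return [min(max(v, floor), ceiling) for v in values]


def passes_variation_filter(values, min_fold=12.0, min_delta=1200.0):
    """Assumed rule: keep a gene only if max/min > min_fold AND max-min > min_delta."""
    hi, lo = max(values), min(values)
    # lo is at least the (positive) floor after thresholding, so the division is safe
    return hi / lo > min_fold and hi - lo > min_delta


def filter_genes(expression, floor=100.0, ceiling=16000.0,
                 min_fold=12.0, min_delta=1200.0):
    """expression maps a gene name to its values across samples.

    Returns the thresholded values of the genes that pass the variation filter.
    """
    kept = {}
    for gene, values in expression.items():
        clipped = threshold(values, floor, ceiling)
        if passes_variation_filter(clipped, min_fold, min_delta):
            kept[gene] = clipped
    return kept
```

With the defaults shown, a gene measured at 50, 5000 and 20000 units is clipped to 100, 5000 and 16000 and retained, while a gene that only ranges from 300 to 500 units is dropped.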




Legend: MD; MGlio; AT/RT CNS; AT/RT Renal/Extrarenal; Ncer; PNET



SOM Clustering of Multiple Tumor Samples
GeneCluster algorithm
Dataset A 

Values thresholded to 100 from below and 16000 from above 
Variation filter: max/min > 12 (12-fold), max-min= 1200 absolute units 
Number of features (genes) = 1065 
Multiple tumor classes predictions (k-NN)

This section contains the detailed sample predictions and error rates for
predicting the different tumor types with a k-nearest neighbor algorithm in
leave-one-out cross-validation.

The model predicts 35 out of 42 samples correctly, which is highly
significant (P-val < 10E-10; see the calculation below and the Proportional
chance criterion).

Multiple tumor classes prediction 
k-nearest neighbors algorithm

Dataset A 

Values thresholded to 20 from below and 16000 from above 
Variation filter: max/min > 5 (5-fold), max-min= 500 absolute units 
Number of features (genes) = 10. K=3, 1/distance weighting 
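
A distance-weighted k-nearest-neighbor classifier evaluated by leave-one-out cross-validation, with the settings listed above (k = 3, 1/distance weighting), might look like the sketch below. The Euclidean metric, the function names and the small constant guarding against division by zero are assumptions made for illustration; the source does not state the distance measure or show its implementation.

```python
import math
from collections import defaultdict


def knn_predict(train, query, k=3):
    """train is a list of (vector, label) pairs.

    The k nearest training samples vote for their label with weight 1/distance.
    Euclidean distance is an assumption.
    """
    neighbours = sorted((math.dist(vec, query), label) for vec, label in train)
    votes = defaultdict(float)
    for distance, label in neighbours[:k]:
        votes[label] += 1.0 / (distance + 1e-12)  # guard against zero distance
    return max(votes, key=votes.get)


def leave_one_out(samples, k=3):
    """Hold out each sample in turn and predict it from all the others.

    Returns (number correct, confusion) where confusion[(actual, predicted)]
    counts how often each pair occurred.
    """
    confusion = defaultdict(int)
    correct = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        predicted = knn_predict(rest, vec, k)
        confusion[(label, predicted)] += 1
        correct += predicted == label
    return correct, dict(confusion)
```

The dictionary returned by `leave_one_out` holds the same information as a confusion matrix: rows are the actual classes and columns the predicted ones.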

Confusion Matrix 



Predicted 

Actual MD MGlio Rhab Ncer PNET 



MD 
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10 
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10 
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10 


Rhab 
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0 
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10 
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0 
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PNET 


3 


0 


1 


0 
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8 




11 


10 


11 


4 


6 
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Proportional chance criterion (see chapter VII of Huberty's Applied Discriminant Analysis) 

Cpro = (10/42)*(10/42) + (10/42)*(10/42) + (10/42)*(10/42) + (4/42)*(4/42) + (8/42)*(8/42)

Cpro = 0.21542

Pcc = 35/42 = 0.833333

Z = (Pcc - Cpro)/Sqrt(Cpro(1-Cpro)/n) = 9.740725 (n = 42), p-val < 10E-10
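
The calculation above can be reproduced with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and converting Z to a p-value with a one-sided standard normal tail is an assumption, since the source only reports the bound.

```python
import math


def proportional_chance_z(class_sizes, n_correct):
    """Return (Cpro, Pcc, Z) for the proportional chance criterion.

    Cpro is the accuracy expected from guessing in proportion to class size,
    Pcc is the observed proportion classified correctly.
    """
    n = sum(class_sizes)
    c_pro = sum((size / n) ** 2 for size in class_sizes)
    p_cc = n_correct / n
    z = (p_cc - c_pro) / math.sqrt(c_pro * (1.0 - c_pro) / n)
    return c_pro, p_cc, z


def upper_tail_p(z):
    """One-sided p-value under a standard normal distribution (assumed)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Class sizes of Dataset A (10 MD, 10 MGlio, 10 Rhab, 4 Ncer, 8 PNET), 35 correct
c_pro, p_cc, z = proportional_chance_z([10, 10, 10, 4, 8], 35)
print(c_pro, p_cc, z, upper_tail_p(z))
```

Run with the class sizes of Dataset A, the sketch agrees with the Cpro, Pcc and Z values reported above, and the resulting p-value is far below 10E-10.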



Datapoint      Predicted Class   Confidence   True Class   Error?
Brain_MD_12    2                 2.80E-04     0            *
Brain_MD_61    4                 0.002653     0            *
Brain_MD_15    0                 0.006245     0
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Classic vs. desmoplastic MD markers 

This picture shows some of the top markers of the classic vs. desmoplastic
distinction, sorted by signal-to-noise ratio as described in the Gene marker
selection section. The table below shows the top 200 markers, including the
permutation test values (see Permutation-based neighborhood analysis for
marker genes). Some of the genes regulated by Shh are shown at right.
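
Ranking genes by a signal-to-noise score for a two-class distinction can be sketched as below. The Gene marker selection section is not reproduced in this part of the document, so the formula used here, (mean0 - mean1) / (sd0 + sd1), is an assumption based on the commonly used definition of this statistic; the use of the population standard deviation and the function names are likewise illustrative choices.

```python
import statistics


def signal_to_noise(class0, class1):
    """Assumed score: difference of class means over the sum of class
    standard deviations. Undefined when both standard deviations are zero."""
    mu0, mu1 = statistics.mean(class0), statistics.mean(class1)
    sd0, sd1 = statistics.pstdev(class0), statistics.pstdev(class1)
    return (mu0 - mu1) / (sd0 + sd1)


def rank_markers(expression, labels):
    """expression maps a gene name to its values across samples;
    labels gives 0 or 1 for each sample, in the same order.

    Returns (gene, score) pairs sorted from most class-0-high to most
    class-1-high.
    """
    scores = {}
    for gene, values in expression.items():
        c0 = [v for v, lab in zip(values, labels) if lab == 0]
        c1 = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = signal_to_noise(c0, c1)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under this convention, genes at the top of the ranking are high in class 0 and genes at the bottom are high in class 1, so markers for both directions can be read off the two ends of the same list.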
Top 200/200 Marker Genes for Classic vs Desmoplastic Medulloblastoma Distinction

Selected by signal-to-noise (mean) ratio

Values thresholded to 20 from below and 16000 from above

Variation filter: max/min > 3 (3-fold), max-min= 100 absolute units 

Dataset B 

Class 0 = High in Classic, low in Desmoplastic 
Class 1 = High in Desmoplastic, low in Classic 
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Melanocyte-specific gene 1 (msgl) mRNA 
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Transcription factor E2F like protein [human, 
mRNA, 2492 nt] 
KIAA0172 gene, partial cds
PSMB5 Proteasome (prosome, macropain) subunit, beta type, 5
Neuronal olfactomedin-related ER localized protein mRNA, partial cds
GPX4 Phospholipid hydroperoxide glutathione peroxidase
NUCLEOBINDIN PRECURSOR
Amyloid precursor-like protein 1 mRNA
KIAA0202 gene, partial cds
Triosephosphate Isomerase
THBS4 Thrombospondin 4
MUSCARINIC ACETYLCHOLINE RECEPTOR M4
Syntaxin 3 mRNA
LIM-homeobox domain protein (hLH-2) mRNA
Novel member of serine-arginine domain protein, SRrp129
Mad protein homolog (hMAD-2) mRNA
ATP1B2 ATPase, Na+/K+ transporting, beta 2 polypeptide
Bcl2, p53 binding protein Bbp/53BP2 (BBP/53BP2) mRNA
ARF4L ADP-ribosylation factor 4-like
GNB3 Guanine nucleotide binding protein (G protein), beta polypeptide 3
FPR1 Formyl peptide receptor 1
CRABP2 Cellular retinoic acid-binding protein 2
COL2A1 Collagen, type II, alpha 1
Glutamate transporter
RPN2 Ribophorin II
PPP2R1B Protein phosphatase 2
HFL1 H factor (complement)-like 1
KIAA0069 gene, partial cds
KIAA0247 gene
SURF1 Surfeit 1
GP-39 cartilage protein gene
Plectin
PDE4C Phosphodiesterase 4C
Alpha-1-Antitrypsin, 5' End
(clone 48ES4) mRNA fragment
GABRG2 Gamma-aminobutyric acid (GABA) A receptor, gamma 2
Unknown protein expressed in macrophages
Retinal protein (HRG4) mRNA
APK1 antigen
KERATIN, TYPE I CYTOSKELETAL 17
Split Gene 1 Enhancer, Tup1-Like
PDE4B Phosphodiesterase 4B
PPP1CA Protein phosphatase 1, catalytic subunit, alpha isoform
SRP19 Signal recognition particle 19 kD protein
Pyruvate dehydrogenase complex (PDHA2) gene
Putative cerebral cortex transcriptional regulator T-Brain-1 (Tbr-1) mRNA
34 kDa Mov34 homolog mRNA
Ca2+-dependent activator protein for secretion mRNA
Lysophospholipase homolog (HU-K5) mRNA
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class 0 0.5747552 0.61550885 
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class 0 0.56919754 0.6072801 

class 0 0.56461215 0.6062743 

class 0 0.56319255 0.6061561 
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MSH3 MutS (E. coli) homolog 3 
Requiem homolog (hsReq) mRNA
Protein kinase, Dyrk2
Endogenous retrovirus mRNA for ORF
HPCA Hippocalcin
Protein tyrosine phosphatase (CIP2) mRNA
CLCN4 Chloride channel 4
FKBP4 FK506-binding protein 4 (59kD)
MYBL1 V-myb avian myeloblastosis viral oncogene homolog-like 1
SUR Sulfonylurea receptor (hyperinsulinemia)
L-UBC
TCRA T cell receptor alpha-chain
Unknown protein mRNA, partial cds
GRANZYME B PRECURSOR
EIF-2-associated p67 homolog mRNA
FMR1 Fragile X mental retardation 1
JAK3 Janus kinase 3 (a protein tyrosine kinase, leukocyte)
Golgi complex autoantigen golgin-97 mRNA
Putative potassium channel subunit (h-erg) mRNA
ApM1 mRNA for GS3109 (novel adipose specific collagen-like factor)
Sigma receptor mRNA
RbP gene (renin-binding protein) extracted from Human Xq28 genomic DNA in the region of the L1CAM locus containing the genes for neural cell adhesion molecule L1 (L1CAM), arginine-vasopressin receptor (AVPR2), C1 p115 (C1), ARD1 N-acetyltransferase related protein (TE2), renin-binding protein (RbP), host cell factor 1 (HCF1), and interleukin-1 receptor-associated kinase (IRAK) genes, and Xq28lu2 gene
KIAA0316 gene
SLC6A8 gene (creatine transporter) extracted from Human Xq28 cosmid, creatine transporter (SLC6A8) gene, and CDM gene, partial cds
TIM17 preprotein translocase
IL15RA Interleukin 15 receptor alpha chain
Proteasome subunit p44.5
MYC V-myc avian myelocytomatosis viral oncogene homolog
ATP6B2 ATPase, H+ transporting, lysosomal (vacuolar proton pump), beta polypeptide, 56/58kD, isoform 2
CDC25A Cell division cycle 25A
Osteoblast specific factor 2 (OSF-2os)
Na+,K+ -ATPase catalytic subunit alpha-III isoform gene
Adenosine triphosphatase mRNA
Glutamate receptor (GLUR5) mRNA
Tax1-binding protein TXBP151 mRNA
Suppressor for yeast mutant
Carbonic anhydrase IV gene extracted from Human carbonic anhydrase IV gene, promoter region and
Interleukin 11 receptor isoform (incomplete)
CELL SURFACE GLYCOPROTEIN MUC18 PRECURSOR
2A8.3 gene (hereditary multiple exostoses gene isolog) extracted from Human chromosome 8 BAC clone CIT987SK-2A8 complete sequence
(clone S20iii15) mRNA, 3' end of cds
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AMY 

ARF3 ADP-ribosylation factor 3
Osteopontin gene
RNA helicase
Transcript associated with monocyte to macrophage differentiation
Tafazzins protein
Receptor protein 4-1BB mRNA
KIAA0162 gene
CYTOCHROME C OXIDASE POLYPEPTIDE VIA-LIVER PRECURSOR
Inward rectifier potassium channel Kir1.2 (Kir1.2) mRNA, partial cds
SCA2 Spinocerebellar ataxia 2 (olivopontocerebellar ataxia 2, autosomal dominant)
RAGE gene (receptor for advanced glycosylation end products) extracted from Human HLA class III region containing NOTCH4 gene, partial sequence, homeobox PBX2 (HPBX) gene, receptor for advanced glycosylation end products (RAGE) gene, and 6 unidentified cds, complete sequence
Huntingtin interacting protein (HIP1) mRNA
Cytochrome c oxidase subunit VIII (COX8) mRNA
NC2 alpha subunit
Translin associated protein X
Na+,K+ -ATPase catalytic subunit alpha-III isoform gene
Clathrin, Light Polypeptide B, Alt. Splice 1
HOXD3 gene
GABRA5 Gamma-aminobutyric acid (GABA) A receptor, alpha 5
Clathrin, Light Polypeptide B, Alt. Splice 2
TRANSDUCIN-LIKE ENHANCER PROTEIN 1
Phosphoenolpyruvate carboxylase
GRB2 Growth factor receptor-bound protein 2
Zinc finger protein C2H2-25 mRNA
RP3 gene
DEAD-box protein p72 (P72) mRNA
CD40 CD40 antigen
Gu binding protein mRNA, partial cds
GRANULOCYTE-MACROPHAGE COLONY-STIMULATING FACTOR RECEPTOR ALPHA CHAIN PRECURSOR
Protein tyrosine phosphatase mRNA
SYP Synaptophysin
Estrogen sulfotransferase mRNA
DD96 mRNA
KIAA0382 gene, partial cds
RIN Ric (Drosophila)-like (expressed in neurons)
Rev-ErbAalpha protein (hRev gene)
MLR Mineralocorticoid receptor (aldosterone receptor)
Surface antigen mRNA
Retina-specific amine oxidase
ACY1 Aminoacylase 1
ELK1 ELK1, member of ETS oncogene family
EIF2A Eukaryotic translation initiation factor 2A
AMELY Amelogenin (chromosome Y encoded)
KERATIN, TYPE II CYTOSKELETAL 2



class 0 0.54005593 0.5776254 

class 0 0.5399942 0.5769982 

class 0 0.53959274 0.5767422 

class 0 0.5386063 0.57520956 

class 0 0.5366796 0.5748313 

class 0 0.5362463 0.574607 

class 0 0.53534234 0.57422996 

class 0 0.53511363 0.574129 

class 0 0.5336789 0.5738576 

class 0 0.5328535 0.57379687 

class 0 0.53211504 0.5699487 

class 0 0.53210926 0.56966376 

class 0 0.532053 0.56789446 

class 0 0.53199273 0.56780976 

class 0 0.53159195 0.56724167 

class 0 0.5308711 0.56693166 

class 0 0.53077525 0.56625986 

class 0 0.52997464 0.56559825 

class 0 0.528599 0.5648617 

class 0 0.5282795 0.56396955

class 0 0.52772367 0.5631641 

class 0 0.5259058 0.56073135

class 0 0.52554536 0.56066585 

class 0 0.52523357 0.5604787 

class 0 0.5247069 0.5601869 

class 0 0.524563 0.5596961

class 0 0.52441233 0.5587105 

class 0 0.5241388 0.55815023 

class 0 0.5238535 0.5580398 

class 0 0.52373844 0.55795956 

class 0 0.52261454 0.5571336 

class 1 0.9920165 1.1499524 

class 1 0.97661066 0.9616979 

class 1 0.9585667 0.8834012 

class 1 0.88205796 0.84205025

class 1 0.880672 0.8215255 

class 1 0.8662454 0.794612 

class 1 0.81006235 0.7840071

class 1 0.8061057 0.7732427 

class 1 0.7963144 0.7665661 

class 1 0.7821859 0.76450706 

class 1 0.7820521 0.75147885 

class 1 0.7767357 0.7358279 

class 1 0.77340704 0.7356313 

class 1 0.7371966 0.72674626 

class 1 0.7175167 0.72600675 
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EPIDERMAL 
Luman mRNA
ASS Argininosuccinate synthetase
ABC3 ATP-binding cassette 3
PLK mRNA
ACHE Acetylcholinesterase (YT blood group)
Protein tyrosine phosphatase
TYK2 Protein-tyrosine kinase tyk2 (non-receptor)
T-cell receptor active alpha-chain mRNA from Jurkat cell line
CC chemokine LARC precursor
KRT13 Keratin 13
Sigma 3B protein
Surfactant protein B mRNA
Transcription Factor Mef2, Alt. Splice 2
Immunoglobulin-like transcript-3 mRNA
VEGF Vascular endothelial growth factor
Neuronal DHP-sensitive, voltage-dependent, calcium channel alpha-2b subunit mRNA
TRYPTOPHANYL-TRNA SYNTHETASE
LTG9/MLLT3 mRNA, C-terminal
ADAR Double-stranded RNA adenosine deaminase
FAST kinase
TAR RNA binding protein (TRBP) mRNA
Clone A9A2BRB6 (CAC)n/(GTG)n repeat-containing mRNA
U1-snRNP binding protein homolog mRNA
ATP6E ATPase, H+ transporting, lysosomal (vacuolar proton pump) 31 kD
AFFX-HUMRGE/M10098_5 (endogenous control)
L43579 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 110298, mRNA sequence
Clathrin, Light Polypeptide B, Alt. Splice 1
PrP gene, exon 2
COL2A1 Collagen, type II, alpha 1 (primary osteoarthritis, spondyloepiphyseal dysplasia, congenital)
SWI/SNF complex 60 KDa subunit (BAF60c) mRNA
PKM2 Pyruvate kinase, muscle
Insulin-Like Growth Factor 2
MGP Matrix protein gla
NDP Norrie disease (pseudoglioma) protein
40S RIBOSOMAL PROTEIN S23
SGNE1 Secretory granule, neuroendocrine protein 1 (7B2 protein)
Ribosomal protein L21 mRNA
Insulin-like growth factor binding protein 5 (IGFBP5) mRNA
APXL Apical protein (Xenopus laevis-like)
RPL35A Ribosomal protein L35a
PROBABLE G PROTEIN-COUPLED RECEPTOR LCR1 HOMOLOG
BCL2 B cell lymphoma protein 2
Decorin, Alt. Splice 1
Ribosomal protein L39
NB thymosin beta
KIAA0068 gene, partial cds






class 1  0.71659106  0.7239192  0.65854394  0.53445965  U14972  Ribosomal protein S10 mRNA
class 1  0.71585566  0.71054196  0.651889  0.5302352  X59841  PRE-B-CELL LEUKEMIA TRANSCRIPTION FACTOR-3
class 1  0.7044602  0.70438224  0.65061134  [illegible]  [illegible]  [illegible]
class 1  0.7026445  0.70189273  0.6452029  0.5213764  HG3214-HT3391  Metallopanstimulin 1
class 1  [illegible]  0.7003029  0.6411657  0.51902276  X06617  RPS11 Ribosomal protein S11
class 1  0.69290376  0.6944331  0.6401815  [illegible]  [illegible]  [illegible]
class 1  0.6838895  0.6939349  0.6379803  0.51201856  X16064  TRANSLATIONALLY CONTROLLED TUMOR PROTEIN
class 1  [illegible]  [illegible]  0.632398  0.5075445  Z74616  COL1A2 Collagen, type I, alpha-2
class 1  [missing]  [illegible]  0.6293568  0.5051634  L40386  DP2 (Humdp2) mRNA
class 1  [illegible]  [illegible]  [missing]  0.5020662  X04741  UBIQUITIN CARBOXYL-TERMINAL HYDROLASE ISOZYME L1
class 1  0.6703414  0.6777735  0.6220077  0.49927366  M55210  LAMC1 Laminin, gamma 1 (formerly LAMB2)
class 1  0.66915506  0.6772996  0.6169228  0.49835536  M96739  NSCL-1 mRNA sequence
class 1  0.66442496  0.67441475  0.6109734  0.4961448  U73304  CB1 cannabinoid receptor (CNR1) gene
class 1  0.6633565  0.67025864  0.6074427  0.49294007  M65292  HFL1 H factor (complement)-like 1
class 1  0.66297793  0.66304475  0.6040194  0.49020073  HG311-HT311  Ribosomal Protein L30


class 1 


U.ODZ I IOO 


U.OQZOjUD 


fi Rfi?RQ^94 


0 4881786 


M83233 


TCF12 Transcription factor 12 


class 1 


U.DD1 UDOUO 


fl RRi4RQ7 


fi Rfifi^12R 


fi 48827R8 


U07919 


ALDH6 Aldehvde dehvdroaenase 6 


class 1 


0.6483891 


O.ooOOoo 


0.59o01o4 


A /10QQ1 A7 K 


aoi yoy 


DDI i7 Dihncrtmol nrAtoin 1 7 
r\r L 1 f r\IDL/ov«/lTicH piUlclFl L' 


class 1 


0.64783746 


0.6586179 


0.5918964 


0.482616 


J04080 


C1S Complement component 1, s 














subcomponent 


class 1 


U.0444040 


n RRRQ774 




0 4810Q1B 


X76029 


NEUROMEDIN U-25 PRECURSOR 


class 1 


0.6431463 


0.6527849 


0.5874605 


0.47882786 


U14973 


ylAC DIQ/'^CrMV/IA 1 DDATCIM COO 

.4Uo KIdUoUMAL KKU 1 cIN o2y 


class 1 


0.64175445 


0.64769405 


0.5862101 


0.47660354 


U24576 


Breast tumor autoantigen mRNA, complete 










• 




sequence 


class 1 


0.63975364 


0.6461164 


0.58385 


0.47526005 


L41066 


NF-AT3 mRNA 


class 1 


0.63081163 


0.6454896 


0.58339643 


0.47360942 


X60489 


Elongation factor- 1 -beta 


class 1 


0.62705797 


0.6454799 


0.5795754 


0.4714921 


M62843 


PARANEOPLASTIC ENCEPHALOMYELITIS 














ANTIGEN HUD 


class 1 




n Rn79HA 
U.UJO / ^uo 


fi R7fi c i79Q 


0 4702747 


HG662-HT662 


Epstein-Ban* Virus Small Rna-Associated 














Protein 


class 1 


0.62111175 


0.63143706 


0.5743106 


0.46925655 


HG613-HT613 


Ribosomal Protein S12 


class 1 


0.6195382 


0.630508 


0.57252336 


0.46820134 


L42379 


Quiescin (Q6) mRNA, partial cds 


class 1 


0.6190998 


0.62700677 


0.5705358 


0.46622977 


M13241 


N-MYC PROTO-ONCOGENE PROTEIN 


class 1 


0.6153043 


0.62574947 


0.5684561 


0.46493605 


U12404


HSPB1 Heat shock 27kD protein 1 


class 1 


0.6119269 


0.62252253 


0.56778365 


0.4618914 


M55998 


Alpha-1 collagen type I gene, 3' end


class 1 


U.0U4f \)( 4 


U.OZUZ 1 I O 


fi RRR7Qfi7 


fi 4Rfi^Q8?^ 


S82240 


RhoE


class 1 


0.5998984 


0.61648834 


0.5644458 


0.45931065 


U78027


L44L gene 


class 1 


0.5997888 


0.61597705 


0.56125534 


0.457502 


X06700 


COL3A1 Alpha-1 type 3 collagen


class 1 


0.5986623 


0.6130399 


0.55806524 


0.45596722 


D13413 


Tumor-associated 120 kDa nuclear protein p120


class 1 


0.59865105 


0.6114418 


0.556301 


0.4549787 


M74719 


obr2-iA protein (iDtr2-TAj mKiNA, o enu 


class 1 


0.5893714 


0.60740095 


0.5559502 


0.45420048 


M93119 


INSM1 Insulinoma-associated 1 (symbol 














provisional) 


class 1 


0.5881829 


0.6069417 


0.55419403 


0.45310345 


M92287 


CCND3 Cyclin D3 


class 1 


0.5843839 


0.60632783 


0.5535847 


0.4518154 


HG33-HT33 


Ribosomal Protein S4, X-Linked 


class 1 


0.58205545 


0.60504496 


0.55251557 


0.4503006 


U16306


CSPG2 Chondroitin sulfate proteoglycan 2 














(versican) 


class 1 


0.57964575 


0.60460865 


0.55130297 


0.44919688 


Z37976 


LTBP2 Latent transforming growth factor beta 














binding protein 2 


class 1 


U.O/ O lOtDH 


fi Rfi^^ftd^ 


fi 548R55 


0.4471849


X69150 


Ribosomal protein S18


class 1 


0.5697472 


0.6027349 


0.54859287 


A A AC A A OAQ 

0.445442UO 


II A O UT/1Q/I7 

no4o42-H 1 4y4/ 


Kioosomai rroiein l iu 


class 1 


0.56644344 


0.60101384 


0.54806477 


0.4440113 


L41607 


GCNT2 Glucosaminyl (N-acetyl) transferase 2, I-branching enzyme


class 1 


0.0Ob27o^O 


A CQ01CCO 

u.oyoiooo 


fi ^At^RQRR 

u.04o%>oyoo 


fi A49RR94R 
U.44iiOO/40 


U40 l*tO 




class 1 


0.56476176 


0.5980047 


0.54293185 


0.4405897 


M30269 


NID Nidogen (enactin) 


class 1 


0.5646692 


0.5970898 


0.54111236 


0.4390807 


X07384 


GLI Glioma-associated oncogene homolog 














(zinc finger protein) 


class 1 


0.5645039 


0.5934657 


0.5405502 


0.4387671 


L38941 


RPL37 Ribosomal protein L37 


class 1 


0.5623646 


0.5929238 


0.5385063 


0.436879 


U09953 


RPL9 Ribosomal protein L9 
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class 1 


0.5605586 


0.59292054 


0.5378748 


0.43572348 


D87464 


KIAA0274 gene 


class 1 


0.5601569


0.5907391


0.53619635 


0.43403146 


M18000


40S RIBOSOMAL PROTEIN S17 


class 1 


0.oo792d3o 


U.5oo21o4 




A ^o-l-i '3Q77 


ivis i i yo 


Intprfprnn pnn^pn^iiQ Qpniipnra 
lUODr I 1 1 lie I I trl Ul l UUi loci louo oci^Ucllwc 














binding protein 1 


class 1 


0.5550686 


0.5858174 


0.5354681 


0.432014 


L27559 


IGFBP5 Insulin-like growth factor binding 














protein 5 


class 1 


U.OOO IOOO 




A 5'*41Afi4 




S76475


NTRK3 Neurotrophic tyrosine kinase, receptor, type 3 (TrkC)


class 1 


0.5519678 


0.5820916 


0.5326356 


0.4299089 


L41067 


Transcription factor NFATx mRNA 


class 1 


0.5519046 


0.5801515 


0.53112125 


0.42853457 


X15940


RPL31 Ribosomal protein L31 


class 1 


0.54831284 


0.57951117 


0.5298944 


0.42775545 


M13934


RPS14 gene (ribosomal protein S14) 


class 1 


0.54646355 


0.57748485 


0.52962327 


0.42654306 


J04164 


RPS3 Ribosomal protein S3 


class 1 




u.o/ ooy / i 








KiriPQin-rplfltpH nrntpin n^rtial nd^ 

[All ICol! 1 1 CICJIOU piUlCIII, paillCJI vUO 


class 1 


0.54087687 


0.57616264 


0.52835536 


0.425178 


HjIQ h coa 


Ribosomal protein S24 


class 1 


0.53762615 


0.57376033 


0.52642053 


0.42397496 


Z25749 


Ribosomal protein S7 


class 1 


0.53477347 


0.5735111 


0.5255214 


0.42304212 


D87735 


CAG-isl 7 (trinucleotide repeat-containing sequence)


class 1 


A C007 /ICO 

v.DoZf 4oy 


A C7*)CCnC7 




A /199'591'3R 




ITf^A7 IntPArin alnha 7R 

1 1 O/A i II llcyi III, aipi Id f D 


class 1 


r\ CO<IQA7/1 

u.bJioy/4 


A CCQ 4 COCQ 
U.ODOlDODO 




A AO\ 'XHC/X 
U.4^1 0/ f yo 


M779^9 


rxlUUbvJI 1 lal (JIUlclll yci ic aiiu iiaiir\niy 














regions 


class 1 


0.52145535 


0.5667701 


0.5209896 


0.42066354 


U29195 


NPTX2 Neuronal pentraxin II 


class 1 


0.51958996 


0.565056 


0.5194922 


0.4199424 


X67734 


AXONIN-1 PRECURSOR 


class 1 


0.5176871 


0.56338716 


0.51761514 


0.41912028 


HG4319-HT4589 


Ribosomal Protein L5 


class 1 


0.51742566


0.5625303


0.51681256 


0.4180165 


M14764 


NGFR Nerve growth factor receptor 


class 1 


0.51 5282 


A GCOAC/17 


A GM/19QQQ 


A A'tfZAR't -1ft 


X69391 


RPL6 Ribosomal protein L6 


class 1 




u.ooy / y4o 


A c;i990Q/l 


n 4iRnn7s^ 


L37043 


CSNK1E Casein kinase 1, epsilon 


class 1 






A c-j 994 <)C 
U.O JZ^HOD 


0 414ftQft^ 

U.*t IH030J 


HG3364-HT3541 Ribosomal Protein L37


class 1 




A CCQOOi 7C 

U.OO0221 /O 


A CAQQQ77C 


U.4 I^O^OUf 


D82348 


5-aminoimidazole-4-carboxamide-1-beta-D-ribonucleotide transformylase/inosinicase


class 1 


0.51271385 


0.55787575 


0.50949436 


0.41301724 


M64716 


RPS25 Ribosomal protein S25 


class 1 


0.5108828 


0.55782914 


0.50928015 


0.4125411 


M81757


40S RIBOSOMAL PROTEIN S19 


class 1 


0.5108767


0.557722


0.50865674 


0.41186035 


X79234 


Ribosomal protein L11 


class 1 


0.5083245 


0.5567942 


0.5078859 


f\ AAAA~7nA~7 

0.41117817 


D13627


KIAA0002 gene 


class 1 


0.5066584 


0.555303 


0.5063182 


0.41037014 


M17254 


TRANSFORMING PROTEIN ERG 


class 1 


0.50393724 


0.55510676 


0.50627005 


0.40977025 


D63476 


KIAA0142 gene 


class 1 


0.5006533 


0.55366576 


0.5056975 


0.40813982 


Z46629 


SOX9 SRY (sex-determining region Y)-box 9 














(campomelic dysplasia, autosomal sex- 














reversal) 


class 1 


0.4997523 


0.5518851 


0.5054221 


0.40693 


X53777 


60S RIBOSOMAL PROTEIN L23 


class 1 


fi 40709 l^ft 


ft ^170476 


" n ^0490064 


0.4063421


M98045 


Folylpolyglutamate synthetase mRNA 


class 1 


0.49669904 


0.55143607 


0.5025265 


0.40599987 


D23660 


RPL4 Ribosomal protein L4 


class 1 


0.4962502 


0.55131245 


0.502122 


0.40507796 


X17254 


GATA1 Transcription factor Eryf1


class 1 


0.495279 


0.550734 


0.5010209 


0.4050369 


D79989 


KIAA0167 gene 


class 1 


0.49403584 


0.5506776 


0.49984783 


0.4037516


U25165 


Fragile X mental retardation protein 1 homolog 














FXR1 mRNA 


class 1 


0.49o/OoUo 




A A 

u.4yootDoo4 


A /1AOA/1Q9Q 
U.4UoU40^0 


U02031 


Sterol regulatory element binding protein-2 














mRNA 


class 1 


0.49297127 


0.5496973 


0.49817213 


0.4027928 


HG3510-HT3704 


V-Erba Related Ear-3 Protein 


class 1 


0.4915902 


0.54927313


0.49810657 


0.40183237 


Z74615 


COL1A1 Collagen, type I, alpha 1 


class 1 


0.4908611 


0.5478711 


0.49570867 


0.4006703 


U58682 


RPS28 Ribosomal protein S28 


class 1 


0.4892058 


0.54716134 


0.49524176 


0.39926985 


HG384-HT384 


Ribosomal Protein L26 


class 1 


0.48803678 


0.5469765 


0.49498755 


0.39867598 


X67247_rna1


RpS8 gene for ribosomal protein S8 


class 1 


0.48778662 


0.5467125 


0.49310726 


0.39855596 


L37868_s 


POU-domain transcription factor (N-Oct-3) 


class 1 


0.4868394


0.5464856


0.49219152 


0.3974036 


Y09836 


3'UTR of unknown protein 


class 1 


A A OC7QQQE 


u.o44oyiy 






U03105 


B4-2 protein mRNA 


class 1 


/\ j r\ a r\~r ^\^\ C\ 

0.48197228 


0.54264337 


0.49125198 


A OACAQ7/17 

0.3960o747 


X80909 


Alpha NAC mRNA 


class 1 


0.48074037 


0.5422034 


0.490522 


0.39525136 


D86982 


KIAA0229 gene, partial cds 


class 1 


0.47754747 


0.54169583 


0.49043605 


0.39439687 


M64099 


GAMMA-GLUTAMYLTRANSPEPTIDASE 5 PRECURSOR


class 1 


0.47734782 


0.54160315 


0.48973796 


0.39351916 


D15050


Transcription factor AREB6 


class 1 


0.47724903 


0.54061115 


0.487093 


0.39287397 


U38846 


Stimulator of TAR RNA binding (SRB) mRNA 



class 1 


0.47694737 


0.5393285 


class 1 


U.4 ( OUT 1 03 


V.DOOO I ODD 


class 1 


0.47327766 


0.537135 


class 1 


0.47271132 


0.53606623 


class 1 


0.47148794 


0.53591806 


class 1 


0.47017133 


0.5355234 


class 1 


r\ A CQD-I CO/I 

U.4byol 534 


U.OoZOoODD 


class 1 


0.46942267 


0.5318093 


class 1 


0.46733344 


0.53141624 


class 1 


0.46700636 


0.52987796 


class 1 


0.46547055 


0.52970636 


class 1 


f\ A coe/tni 7 
U.4b/£b4U i ( 


U.5^yU44 ( 0 


class 1 


r\ AC'i Q7Q 


U.O^O/UD 


class 1 


0.46098718 


0.5282426 


class 1 


0.45965302 


0.5280224 


class 1 


0.45465934


0.52741534 


class 1 


0.4498619 


0.5268396 


class 1 


0.44982657 


0.52639043 


class 1 


n a /1007000 
3^3 


u.ozor ooy 


class 1 


0.44878542


0.5257578 


class 1 


0.44854823 


0.5250102 


class 1 


0.44723308 


0.52477926 


class 1 


0.446246 


0.5233783 


class 1 


0.4457633 


0.5233542 


class 1 


0.44380325 


0.52199423 


class 1 


0.44341132


U.OZ 14^yob 


class 1 


0.44313014 


0.52122724 


class 1 


0.4424557 


0.5204168 


class 1 


0.44135985 


0.52034736 


class 1 


0.4395942 


C\ COrt007ylC 

0.52032745 


class 1 


0.43931964 


r\ corn oqoc 
U.5<:Ulzoz5 


class 1 


n ylOQCOOOC 

Q.43o523^b 


u.5iyb/by 


class 1 


0.43827492 


0.5183492 


class 1 


0.4373Ub4b 


U.5 I OUU4 1 


class 1 


0.43694 /4b 


U.5 i / 5UU/ 


class 1 


U.43b3z^4z 


U.5 lbo3414 


class 1 


0.4341169 


0.51539576 


class 1 


0.43373224 


0.51529187 


class 1 


0.43370542 


0.5148967 


class 1 


0.43326634 


0.514756 


class 1 


0.4322948 


0.513936 


class 1 


0.42978635 


0.51332444 


class 1 


0.42779183 


0.5132742 


class 1 


0.4273552 


0.5130134 


class 1 


0.42513227 


0.51109487 


class 1 


0.42448258 


0.51027435 



0.4867437 


0.39244702 U26726 




0 3Q1 75413 U43901 rnal S 


0.48561665 


0.3909066 M30448_s 


0.48516303 


0.38992578 HG821-HT821 


0.48515683 


0.38910928 X56932 


0.48478094 


0.38841477 D87433 


ft /lft4R^1 74 


ft 3fiR101ft? 1 77886 

u.OOO Iv 1 Li f OOU 


0.48331326 


0.38732716 X04325 


0.4826737 


0.3858285 X62691 


0.4821765 


0.38530836 M23613 


0.4814417 


0.3844768 HG2994- 




HT4850_s 


fl 4ftflftQ7 
u.4ouoa< 


f) 3R32074 HG2873-HT3017 


n 4ftnno^R7 

U.4oUUyD0f 


ft ^ft9RRR44 1 IROfiPR 


0.48005548 


0.3827526 X55715 


0.47943622 


0.3822391 X07173 


0.47873193 


0.381494 211793 


0.47792363 


0.38107353 M14058 


0.47577056 


0.38061887 X56997_rna1 


ft 47 c 1 RftQft 


ft 37QR872 M31520 ma1 s 

\j.oi joo/ c ivio iw4u ilia i o 


0.4742035 


0.37922156 HG4716-HT5158 


0.47346252 


0.3788581 U29943_S 


0.47323424 


0.37794998 X03342 


0.47306406 


0.37778178 D86961 


0.47277132 


0.3767854 M87789_S 


0.47275352 


0.37630454 X12671_rna1


U.4/ 43(541 O 


ft 07C,01QC MR94fl9 
\J.OI OO WO IVIO^H'Ut 


0.47210145 


0.37470126 D42123 


0.4712368 


0.37411138 M13450 


0.47011894 


0.37278944 L41349 


U.4byU4U4o 


U.O / ZOODOO uz/ooo 


U.4b0/ /444 


ft 779A9R77 VftftQI^ 


n /iRftt;-iRft7 
U.4booiboo 


ft *^71QilAft4 in^Rf»7 


0.467564 


0.37121144 D13370


ft AR57Qft7/l 

u.4bD/ yo/ 4 


ft 17ftQQftft4 nfl74fift 


ft 4Rm4il9 
U.4DO l*>*tZ 


ft ^fnn^Q IJ14Q70 


ft /lfi/iRA7R9 
U.4b4b4fuZ 


ft 37flftR^ft YQQ^O^ 


0.46385196 


0.36979818 S79522 


0.46378323 


0.36874855 U83411 


0.46348524 


0.3684767 U13616


0.4634733 


0.36746866 L04483_s 


0.46344924 


0.3671752 J00314 


0.46247655 


0.36685425 X04347_s 


0.4624057 


0.36615053 U08096 


0.4621326 


0.36527485 U31814 



0.46213248 0.36480188 D13988 
0.4615859 0.3641077 HG1515- 
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11 beta-hydroxysteroid dehydrogenase type II 
mRNA 

37 kD laminin receptor precursor/p40 ribosome 

associated protein gene 

Casein kinase II beta subunit mRNA 

Ribosomal Protein S13 

LCAT Lecithin-cholesterol acyltransferase 

KIAA0246 gene, partial cds 

Protein tyrosine phosphatase mRNA 

GJB1 Gap junction protein, beta 1, 32kD 

(connexin 32, Charcot-Marie-Tooth 

neuropathy, X-linked) 

40S RIBOSOMAL PROTEIN S15A 

NPM1 Nucleophosmin (nucleolar 
phosphoprotein B23, numatrin) 
Elastin, Alt. Splice 2 

Ribosomal Protein L30 Homolog 

Thymidine kinase 2 (TK2) mRNA 

RPS3 Ribosomal protein S3 

INTER-ALPHA-TRYPSIN INHIBITOR 
COMPLEX COMPONENT II PRECURSOR 
Selenoprotein P 

C1R Complement component C1r

UbA52 gene coding for ubiquitin-52 amino acid 
fusion protein 

Unknown protein gene extracted from Human 
ribosomal protein S24 mRNA 
Guanosine 5'-Monophosphate Synthase

ELAV-like neuronal protein-2 Hel-N2 mRNA 
RPL32 Ribosomal protein L32 
KIAA0206 gene, partial cds 
(hybridoma H210) anti-hepatitis A IgG variable 
region, constant region, complementarity- 
determining regions mRNA 
Hnrnp a1 protein gene extracted from Human 
gene for heterogeneous nuclear 
ribonucleoprotein (hnRNP) core protein A1 
IGFBP6 Insulin-like growth factor binding 
protein 6 
ESP1/CRP2 

ESD Esterase D/formylglutathione hydrolase 

PLCB4 Phospholipase C, beta 4 

RGP3 mRNA 

Alpha 4 protein 

C7 Complement component 7 

DNA-(APURINIC OR APYRIMIDINIC SITE) 
LYASE 

KIAA0270 gene, partial cds 

RPS5 Ribosomal protein S5 

Alpha-tubulin mRNA 

UBA52 Ubiquitin A-52 residue ribosomal 
protein fusion product 1 
Carboxypeptidase Z precursor, mRNA 

ANK3 Ankyrin G 

RPS21 Ribosomal protein S21 

mRNA fragment encoding beta-tubulin. (from 
clone D-beta-1) 

Liver mRNA fragment DNA binding protein UPI 
homologue (C-terminus) 
Peripheral myelin protein-22 (PMP22) gene, 
non-coding exon 1 B 

Transcriptional regulator homolog RPD3 
mRNA 

Rab GDI mRNA 
Transcription Factor Btf3b 
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HT1515J 



class 1 


0.4230558 


0.51021904 


0.4607458 


0.3631419 


M84711 


RPS3A Ribosomal protein S3A 


class 1 


0.42282543 


0.5098846 


0.45967656 


0.36288628 


L13698 


GAS1 Growth arrest-specific 1 


class 1 




u.ouyozu/ 




\J.O\JjCO I ooo 


1 n7Q1Q 
l_U f i? 1 3 


Hnmpnrinmain nrntpin DLX-? mRNA V pnH 


class 1 


0.4207693 


0.50906163 


0.45930198 


0.36156428 


L06505 


RPL12 Ribosomal protein L12 


class 1 


0.4203779 


0.5077076 


0.45851415 


0.36111102


L07648 


MXI1 mRNA 


class 1 


0.42026496 


0.5075645 


0.45740777 


0.360508 


WAT ,IOO — 

X07438_s 
Classic vs. desmoplastic MD prediction results (k-NN).

This section contains the detailed sample predictions and error rates for predicting
classic vs. desmoplastic medulloblastoma in leave-one-out cross-validation with a
k-nearest neighbor algorithm.

The model predicts 33 out of 34 samples correctly, which is highly significant
(P-val = 0.0000008630; see the proportional chance criterion calculation below).
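The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration, not the code used for the reported results: the value of k, the Euclidean distance, and the toy two-gene profiles are all assumptions, and any gene selection performed inside each fold in the real analysis is omitted.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples nearest to `query`."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out(samples, labels, k=3):
    """Predict every sample from a model trained on all the other samples."""
    predictions = []
    for i, query in enumerate(samples):
        rest_x = samples[:i] + samples[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        predictions.append(knn_predict(rest_x, rest_y, query, k))
    return predictions


# Toy expression profiles (two hypothetical genes per sample).
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
           [5.0, 5.1], [5.2, 4.9], [4.8, 5.0], [5.1, 5.3]]
labels = ["classic"] * 4 + ["desmoplastic"] * 4

predictions = leave_one_out(samples, labels, k=3)
n_wrong = sum(p != t for p, t in zip(predictions, labels))
print(f"{len(labels) - n_wrong} right, {n_wrong} wrong, "
      f"error rate {n_wrong / len(labels):.3f}")
```

Each sample is held out exactly once, so the number of wrong predictions divided by the number of samples gives the cross-validation error rate (1/34 = 0.029412 in the results reported in this section).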



Classic vs. desmoplastic medulloblastoma prediction 
k-nearest neighbors algorithm

Dataset B 

Values thresholded to 20 from below and 16000 from above 
Variation filter: max/min > 3 (3-fold), max-min= 100 absolute units 
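The thresholding and variation filter settings listed above can be sketched as a small preprocessing step. This is an illustrative sketch only: the function names and the toy data are hypothetical, and treating both filter conditions as strict inequalities that must hold together is an assumption.

```python
def clip(values, floor=20.0, ceiling=16000.0):
    """Threshold expression values from below and from above."""
    return [min(max(v, floor), ceiling) for v in values]


def passes_variation_filter(values, min_fold=3.0, min_delta=100.0):
    """Keep a gene only if it varies enough across samples.

    Assumption: both max/min > min_fold and max - min > min_delta must hold.
    """
    lo, hi = min(values), max(values)
    return hi / lo > min_fold and hi - lo > min_delta


def preprocess(expression):
    """Clip every gene, then drop the genes that fail the variation filter."""
    kept = {}
    for gene, values in expression.items():
        clipped = clip(values)
        if passes_variation_filter(clipped):
            kept[gene] = clipped
    return kept


# Hypothetical genes: one flat, one highly variable, one with a small range.
toy = {
    "g_flat": [50, 60, 55],
    "g_variable": [10, 400, 20000],
    "g_small_range": [20, 70, 90],
}
filtered = preprocess(toy)
print(sorted(filtered))
```

Clipping is applied before the filter, so the floor of 20 keeps the max/min ratio finite even for genes with near-zero or negative raw values.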



Confusion Matrix

                      Predicted
Actual           Classic   Desmoplastic   Total
Classic               24              1      25
Desmoplastic           0              9       9
Total                 24             10      34

Proportional chance criterion (see chapter VII of Huberty's Applied Discriminant Analysis) 


Cpro= (25/34)*(25/34) + (9/34)*(9/34) 

Cpro= 0.610727 

Pcc= 33/34= 0.970588 

(Pcc-Cpro)/Sqrt(Cpro(1-Cpro)/n) = Z= 4.783099 P-val = 0.0000008630 
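The proportional chance calculation above can be reproduced with a few lines of code. This is a sketch under stated assumptions: the function name is hypothetical, and the P-value is taken to be the one-sided upper tail of the standard normal distribution. When the formula is evaluated with n = 34 it gives Z of about 4.30; the reported Z = 4.783099 appears to correspond to n = 42 in the denominator, so the sample size used in the standard error is left as an explicit argument.

```python
import math


def proportional_chance(class_sizes, n_correct, n=None):
    """Return (Cpro, Pcc, Z, one-sided P-value) for a classifier.

    class_sizes: number of samples in each true class
    n_correct:   number of correctly predicted samples
    n:           sample size used in the standard error (defaults to the total)
    """
    total = sum(class_sizes)
    n = total if n is None else n
    c_pro = sum((size / total) ** 2 for size in class_sizes)
    p_cc = n_correct / total
    z = (p_cc - c_pro) / math.sqrt(c_pro * (1 - c_pro) / n)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of N(0, 1)
    return c_pro, p_cc, z, p_value


# Classic (25) vs. desmoplastic (9), 33 of 34 samples predicted correctly.
c_pro, p_cc, z_34, p_34 = proportional_chance([25, 9], 33)
_, _, z_42, p_42 = proportional_chance([25, 9], 33, n=42)
print(f"Cpro={c_pro:.6f} Pcc={p_cc:.6f}")
print(f"n=34: Z={z_34:.4f} P={p_34:.3g}")
print(f"n=42: Z={z_42:.6f} P={p_42:.3g}")
```

Either way the prediction accuracy is far above the proportional chance level, so the qualitative conclusion does not depend on which n is used.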



Num Data Num Right Num Wrong Threshold Num Abstain Abs Error ROC Error 

34 33 1 0 0 0.029412 0.02 

34 5 

Datapoint Predicted Class Confidence True Class Error? 

Brain_MD_7 0 0.302768 0 

Brain_MD_59 0 0.548534 0
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SOM clustering of treatment outcome samples. 

In order to study the unsupervised intrinsic structure of the medulloblastoma data,
we clustered the samples using a SOM algorithm. We performed multiple
clusterings to make sure the results were robust and reproducible, and
selected the most common clustering result as representative. This is the
2-cluster scheme shown below, which separates the medulloblastomas into two groups:
C0 with 23 samples and C1 with 37 samples.

The only clinical attribute or observation that appears to correlate with these
discovered classes is the abundance of ribosomal protein-encoding genes. See,
for example, the list of marker genes for these classes in the SOM-discovered C0
vs. C1 class gene markers section.

There is a correlation of the C0 and C1 groups with outcome, but it is barely
significant and does not provide an accurate predictor of outcome (confusion
matrix Fisher test P-value=0.104 and survival log-rank test P-value=0.027; see
calculations below). The error rate of such a predictor would be worse than that
of the multi-gene models, staging or TrkC (see the Summary of
medulloblastoma treatment outcome predictions section).

At the same time, it is clear that these unsupervised classes provide a background on
which the outcome markers behave differently (see the Treatment outcome
markers section).



Medulloblastoma Treatment outcome Clustering 

Values thresholded to 100 from below and 16000 from above 
Variation filter: max/min > 5 (5-fold), max-min= 500 absolute units 
SOM 2x1 250,000 iterations 

(most common clustering of samples in 10 independent clusterings) 
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Fisher exact test 


p-value = 0.1042
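For readers who want to see how a Fisher exact P-value is obtained from a 2x2 confusion matrix, a minimal sketch follows. The cluster-by-outcome counts behind the reported value are not reproduced in this part of the document, so the example uses a small toy table; the function name and the choice of a two-sided test are assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x counts in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Toy table: rows are clusters, columns are outcomes.
p_toy = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p-value = {p_toy:.4f}")
```

The small relative tolerance in the comparison guards against floating-point ties between equally probable tables.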







Proportional chance criterion (see chapter VII of Huberty's Applied Discriminant Analysis) 

Cpro= (39/60)*(39/60) + (21/60)*(21/60) 

Cpro= 0.545 

Pcc= 34/60= 0.5666667 

(Pcc-Cpro)/Sqrt(Cpro(1-Cpro)/n) = Z= 0.281976 p-val = 0.388980856 
SOM-discovered C0 vs. C1 class gene markers

This picture shows some of the top markers that differentiate the discovered C0 and C1
classes, sorted by their signal-to-noise ratios as described in the Gene
marker selection section.
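Ranking genes by a signal-to-noise ratio can be sketched as below. The Gene marker selection section is not part of this excerpt, so the formula is an assumption: the difference of the class means divided by the sum of the class standard deviations, which is a common definition of this statistic for expression marker selection. Gene names and values are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(values, labels):
    """(mean of class 0 - mean of class 1) / (sd of class 0 + sd of class 1)."""
    class0 = [v for v, label in zip(values, labels) if label == 0]
    class1 = [v for v, label in zip(values, labels) if label == 1]
    return (mean(class0) - mean(class1)) / (stdev(class0) + stdev(class1))


def rank_markers(expression, labels):
    """Genes sorted from most class-0-specific to most class-1-specific."""
    scores = {gene: signal_to_noise(values, labels)
              for gene, values in expression.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical expression values for six samples (three per class).
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "geneA": [10, 12, 11, 1, 2, 3],   # high in class 0
    "geneB": [5, 6, 5, 6, 5, 6],      # uninformative
    "geneC": [1, 2, 3, 10, 12, 11],   # high in class 1
}
ranking, scores = rank_markers(expression, labels)
print(ranking)
```

Genes at the top of the ranking are markers of class 0 and genes at the bottom are markers of class 1, which is how a marker list for two classes can be read off a single sorted score.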
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c'ds2_iRPST4 gene (nbosomai protei n Sl4p 



M6T757Iir 
Ml8006~aP 

1^321"4^33rK^tanopansiimulrn' 
I06505Iat 



•40"S^OSOM^L"FROTBNI"S"f5~ 
40S 'WBO^WATTWTaO^T" 



Fybosomarprolein SI'S""" 



Chancer of rudimentary homolbg n^KlA**" 



I^PSSTSbosomarprotein SS 



"RPC31 l^bosornal protein C3rP 
RPCTSTrabosomal protet nTTS 
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M31520 rnal s Unknow n protein extracted from Human riBosbrnal protein S24 


mms<5 s at 


LAN/R1 Larrmin receptor (2ht5 epitope) 


U14968_at


Ribosomal protein L27a mRNA


M24194 at 
L"06797*s"ar~ 


Alpha-tubulin rrRNA 

PTOBA BCE GFWra^COOTCSDRECB'rOR LCR1 ' HOM0tlCX3™' 


U15008_at


SnRNP core protein Sm D2 mRNA


72"S407_at 


RPL8 Rbosoma! protein L6 


D6234S_at "~ 1 


5-am'noimoazdle-4-carbbxarnde-1-bela 


m&672 at 


RPL7A Ribosomal protein L7a 


K M84TT6'~a'r H 


RPS25 Ribosomal protein S25 


r X52966 af " 


RPL35A Ribosomal protein L35a


C04483 s at 
XS"CS822la"t 


RPS21 Ribosomal protein S21


WS'"R]SD51DIWrPROTt3NL"1flA 1 


U12404_at 


HSPB1 Heat shock 27kD protein 1 


U58682_at


RPS28 Ribosomal protein S28 


SWS7iIaT ' 


RPS9 Ribosomal protein S9


e K6 34*60 at 


Alpha-tubulin isotype H2-alpha gene, last exon


U26726_at


11 beta-hydroxysteroid dehydrogenase type II mRNA


X55954_at


RPL17 Ribosomal protein L17


U21090_at


DNA polymerase delta small subunit mRNA


D23660_at


RPL4 Ribosomal protein L4


X55715_at


RPS3 Ribosomal protein S3


r M77232 rna1_e 


Ribosomal protein S8 gene and flanking regions 


'XTSSSTaC TRibbsbmal protein L1 1 


078027 rna3 a'L44Lgene 


•a 1954 at , 


PB^IFVteW^TYreBB^ODlAZBlNE RtCEPl OK 


L4038S s al 


W2"{Ru'mdp2) mRWA 


*HGBiOTTfl"i3_i 


Ribosomal R-otein Si 2 


Z48501 s at 


Polyadenybte binding protein 11 


•025789 at ' 


Rfosorna! protein L21 mRf44 
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TXJ79T9_al" 
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Rab' geranylgefariyl transferase, alpha-subunit ~] 
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[M74491 at 
tSBTSTat 


ARF3 ADP-ribosylation factor 3 

wcemirriDVCif«5Sr^^ 


X99664_at' 


Protein containing SH3 domain, SH3GL3 


r 079"2661at 
X0452BIiI 


-aone'23T451TiRm 


ISKBTGuarilne hucleolideTSinding protein (G protein ) 


•■058142 at 
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Carboxylesterase (hC&2) mRNA 
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Prbtocadherin 43 mRNA for abbreviated PC43 
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Serotonin N-acetyltfahsf erase gene 
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One- 18 homologue 


27i460_at 


Vacuolar-type H(+)-ATR3se 115 kDa subunit 


X65784 s al 


Cellular adhesion regulatory molecule [Hurfan. mRNA, 429 ht] 


X99683 at 


mRNA from TVC "gene 


r M6bi65 cdsl 


''HCA-OQBl gene 


h J04B15 ai 
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SNRPN SmaU nuclear nDonucleoprotein polypeptide N 


hsNAH Guanine nucleotide binding protein (G protein) 
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"093237 rna2_a r Mes!1 gene (menin) 


TJ8T802 at iRiospha tidy linos itoi 4-kinase 


r 024T52 at J P21 -activated protein kinase <Rak1 ) gene 


^0663^^420 ITAdductn. Alpha Subunit. Alt. Splice 2 


"X96484 at 1 DGCR6 protein 


D9"5740"Tna2""B 


362G6.2 gene extracted from Human chromosome 16p13.1 


D87465-aT 


"WAA*d275gene 


X75982 at 


"OX40rRSCffTORPRB30R50R 


059302 al ' 


T Steroid receptor coactivator (SRC- 1 ) mRNA 


a27831_ar" 


Strialum-ehriched phosphatase (SIHH) mRNA. partial cds 


"004811 at 


Trdphlnin mRNA 


^US9336~cas3" 


^CC7CAT^K3X^lNt5lKIGnrR^^ 


RAGE gene 


!M24899 at" " 


THRA Thyroid hormone receptor 


fDe7743 - aT 
fD874373'l 


K1AA02B7 gene, partial "cds 


r RlAA0250 gene 


!X9053d_at 


RagB protein 


r0T5T72_al 
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Treatment outcome markers 

This picture shows some of the top 50 markers of the treatment failure vs. survival
distinction. The genes are sorted by their signal-to-noise ratios as described in
the Gene marker selection section. The table below shows the top 100 markers for
each class, including the permutation test values (see Permutation-based
neighborhood analysis for marker gene). The samples are sorted by
treatment outcome status and then by membership in the unsupervised
SOM-discovered classes C0 and C1. Note that the markers behave differently depending
on the class a sample belongs to. For example, the markers that are low in failures
and high in survivors do not distinguish the failure samples in the C0 class
very well.
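The permutation test values in the table below (Perm 1%, Perm 5% and Median) can be understood through the following sketch. It is an assumption-laden illustration rather than the procedure used for the reported numbers: class labels are shuffled, genes are re-scored and re-ranked for each shuffle, and for every rank the 99th, 95th and 50th percentiles of the permuted scores are recorded. The scoring function, the number of permutations and the toy data are all hypothetical.

```python
import random
from statistics import mean, stdev


def signal_to_noise(values, labels):
    """Assumed marker score: difference of class means over sum of class sds."""
    class0 = [v for v, label in zip(values, labels) if label == 0]
    class1 = [v for v, label in zip(values, labels) if label == 1]
    return (mean(class0) - mean(class1)) / (stdev(class0) + stdev(class1))


def ranked_scores(expression, labels):
    """Scores of all genes, best class-0 marker first."""
    return sorted((signal_to_noise(v, labels) for v in expression.values()),
                  reverse=True)


def permutation_levels(expression, labels, n_perm=100,
                       quantiles=(0.99, 0.95, 0.50), seed=0):
    """For each rank, the permuted-score level at each quantile."""
    rng = random.Random(seed)
    per_rank = [[] for _ in expression]
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real label/expression association
        for rank, score in enumerate(ranked_scores(expression, shuffled)):
            per_rank[rank].append(score)
    levels = []
    for scores in per_rank:
        scores.sort()
        levels.append([scores[min(len(scores) - 1, int(q * len(scores)))]
                       for q in quantiles])
    return levels


# Toy data: 20 hypothetical genes measured on 6 + 6 samples.
data_rng = random.Random(1)
labels = [0] * 6 + [1] * 6
expression = {f"gene{i}": [data_rng.gauss(100, 10) for _ in labels]
              for i in range(20)}

observed = ranked_scores(expression, labels)
levels = permutation_levels(expression, labels)
for rank in range(3):
    perm1, perm5, median = levels[rank]
    print(f"rank {rank + 1}: observed={observed[rank]:.2f} "
          f"perm1%={perm1:.2f} perm5%={perm5:.2f} median={median:.2f}")
```

An observed score that exceeds the Perm 1% level at its rank is larger than would be expected from random class labels in 99% of the permutations, which is how the columns of the table can be compared with the Distance column.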



Sample columns: Failures (C1, C0), Survivors (C1, C0)

Markers of Survival
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Markers of Treatment Failure 
Variation filter: max/min > 5 (5-fold), max-min= 500 absolute units

Dataset C 



class 0 = High in failure class, low in survivors class

class 1 = High in survivors class, low in failure class



Class    Distance  Perm 1%    Perm 5%     Median (50%)  Feature             Desc
class 0  0.79      0.8575851  0.7458573   0.57256466    X69150_at           Ribosomal protein S18
class 0  0.58      0.7196305  0.6636454   0.5213177     M36072_at           RPL7A Ribosomal protein L7a
class 0  0.52      0.6967163  0.62734157  0.49769974    X13293_at           MYBL2 V-myb avian myeloblastosis viral oncogene homolog-like 2
class 0  0.43      0.6554627  0.6037056   0.48016676    U14972_at           Ribosomal protein S10 mRNA
class 0  0.39      0.6267573  0.5743042   0.4491265     K03189_f_at         Chorionic gonadotropin (hcg) beta subunit mRNA
class 0  0.37      0.6124803  0.5631084   0.44041193    L17131_rna1_at      High mobility group protein (HMG-I(Y)) gene exons 1-8
class 0  0.37      0.6079583  0.5574408   0.4283559     X13482_at           U2 SMALL NUCLEAR RIBONUCLEOPROTEIN A'
class 0  0.36      0.5920371  0.5417171   0.42088938    L12711_s_at         TKT Transketolase (Wernicke-Korsakoff syndrome)
class 0  0.36      0.5873178  0.5382286   0.41427517    L19711_at           Dystroglycan (DAG1) mRNA
class 0  0.35      0.5774571  0.5246414   0.40221128    X04741_at           UBIQUITIN CARBOXYL-TERMINAL HYDROLASE ISOZYME L1
class 0  0.35      0.5691193  0.52006763  0.39952728    U12404_at           HSPB1 Heat shock 27kD protein 1
class 0  0.35      0.564594   0.5123353   0.39575094    U15008_at           SnRNP core protein Sm D2 mRNA
class 0  0.34      0.5508108  0.5075411   0.39274555    U81375_at           Placental equilibrative nucleoside transporter 1 (hENT1) mRNA
class 0  0.34      0.545049   0.5041218   0.3862151     X13794_rna1_at      Lactate dehydrogenase B gene exon 1 and 2 (EC 1.1.1.27) (and joined CDS)
class 0  0.33      0.5437621  0.4992256   0.38400882    Z49148_s_at         Enhancer of rudimentary homolog mRNA
class 0  0.33      0.5414132  0.49528137  0.380192      U39318_at           AF-4 mRNA
class 0  0.33      0.5383373  0.49330828  0.37714508    X67247_rna1_at      RpS8 gene for ribosomal protein S8
class 0  0.33      0.5350176  0.4877165   0.37104735    U14968_at           Ribosomal protein L27a mRNA
class 0  0.33      0.5349308  0.48364687  0.36859724    HG613-HT613_at      Ribosomal Protein S12
class 0  0.32      0.5341373  0.48146704  0.36665422    D63880_at           KIAA0159 gene
class 0  0.32      0.5304447  0.47949836  0.3642997     Y07604_at           Nucleoside-diphosphate kinase
class 0  0.32      0.5302321  0.47619662  0.3608576     J04823_rna1_at      Cytochrome c oxidase subunit VIII (COX8) mRNA
class 0  0.31      0.5268929  0.47357216  0.35861334    M13934_cds2_at      RPS14 gene (ribosomal protein S14) extracted from Human ribosomal protein S14 gene
class 0  0.3       0.5230379  0.4727932   0.35644948    U30872_at           CENP-F kinetochore protein mRNA
class 0  0.3       0.517165   0.471055    0.3537757     M81757_at           40S RIBOSOMAL PROTEIN S19
class 0  0.3       0.510794   0.46896455  0.35145545    L07515_at           HETEROCHROMATIN PROTEIN 1 HOMOLOG
class 0  0.3       0.5095285  0.46605366  0.35070398    M14328_s_at         ENO1 Enolase 1, (alpha)
class 0  0.29      0.508716   0.46462357  0.34898144    D82348_at           5-aminoimidazole-4-carboxamide-1-beta-D-ribonucleotide transformylase/inosinicase
class 0  0.29      0.508341   0.46285483  0.3464003     D78586_at           CAD PROTEIN
class 0  0.29      0.5031112  0.46092945  0.34413984    M32886_at           SRI Sorcin
class 0  0.28      0.5001085  0.45781645  0.3421644     U31556_at           E2F5 E2F transcription factor 5, p130-binding
class 0  0.27      0.4985688  0.4531834   0.34044018    X94910_at           ERp31 protein
class 0  0.27      0.4967446  0.45302796  0.33841276    Y10313_at           Nerve growth factor-inducible PC4 homologue
class 0  0.27      0.4962007  0.45097864  0.3369098     S78187_at           M-PHASE INDUCER PHOSPHATASE 2
class 0  0.26      0.4961145  0.4478716   0.3364148     HG2479-HT2575_s_at  Helix-Loop-Helix Protein Sef2-1d
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classO 


A or» O ^DOft7n A A A C A A 07C 

0.26 0.4889722 0.44o4497b 


a. ova A 1 0 I 11 9J\QA at 


Tumor norrncie faHnr t\zno 1 rorpntAr associated Drotein (TRAP1 ) mRNA oartial cds 


classO 


0.2b 0.48b3545 U.444/0/UO 




P.\sctin mRNA 

oysun iTiArvM 


classO 


0.2b 0.48b333B 0.444101 




ivieiaiiopansiirnuiin i 


classO 


u.zb o.4o4o/oo u.*r*tonyy£0 


ft ^9ftRo^ Hf"^4S4 n -HT4Q47 at 


AuOUSUIlldl rlUlclll Liu 


classO 


A OCA AO A A 7*iC A j| JlT/l C 

0.25 0. 484472b 0.4424^U45 


n iogqcicq n'jQan^ at 

U.j/DOD IDO UioOUD_at 


fi^TRO (^iurAnmtotn-4-hpfa_nataf tnsvltransferase 2 


classO 


A IE A AOAC\<if\A A /MO/1 1Qfl 

0.25 0.4840101 U.4404iyo 


a 00=0.7070 Y^aqcc at 


DDI iha DJhocomal nmtoln 1 ^Ra 


classO 


0.25 0.4818713 0.4Jy^0/y/ 


A 'iltZIAAR Mfi,A71fi at 


nococ DiKocAmat nrAtoin 

KroiJ Kiuosomai proieiii o^o 


classO 


0.25 0.4805192 0.4372189 


u.ozoooo^ rvio*to*t r_ai 


crcoi FihrohbcJ nrAwth fartAr rprpntnr "\ farJinnrlmnlasia thanatODhoric dwarfism) 


classO 


0.25 0.4799493 0.4344101 


A '30171AR/1 1 inQ77H at 

U.OZ 1 / OUD4 UUM/ /u_ai 


uysieine-ncn nean proiein ^ni^r^nry miain^ 


classO 


0.24 0.478935 0.43330073 


A O.OA7iniR r\1AA71 c at 

u.ozu / 1 Ujd uzo*t / j_s_ai . 


IADC Ipnlonrinn tDKIA e\/nthotaCD 

lAKa isoieucine-iAiNM syninBuiaB 


classO 


0.24 0.4770883 0.43237802 


A 0.1QC.91 YfiQQAQ mal at 


P2 Qene for c subunit of mitochondrial ATP synthase gene extracted from H. sapiens gene for mitochondrial ATf 


classO 


A Ail A <(TOC n CC A ylOAA^AAO 

0.24 0.4735355 0.43201992 


A QIPCOIAQ 1 l7fifi^R ot 

o.oioozouo u/ODJo_ai 


BKOAi •associaieo kiino oomain proiein jDnAu i / imainaa 


classO 


0.24 0.4734543 0.4286573 


0.31 740758 X79234_at 


Ribosomal protein L1 1 


classO 


0.24 0.4714663 0.42765373 


A o^eoooeQ Y1 W7R ot 

0.31b203b8 AlbJ/b_3t 


GABKij^ viamma-aminoDutync acia ^oadm^ t\ retepiur, yduinid ^ 


classO 


0.24 0.4707498 0.42b1b7 25 


ft 1iy|Qtn(l7 HiH/tOO e at 

u.o i looio / ivi i*t »yy__s_ ai 


LAMKi Lamnin receptor i^na epitope; 


classl 


0.68 0.8203825 0.71479475 


A C>( 1 7QCCC l AC/1 10 

0.541 78bbb LUb4iy_3t 


PLOD Lysyl hydroxylase 


class 1 


0.68 0.6813562 0.61684877 


0.4866072 J02b1 1_at 


APOD Apolipoprotein D 


classl 


0.62 0.6525392 0.5892bb9 


u.4b i<ciooo U0Dy/4 - _ai 


KIAA0220 gene, partial cds 


classl 


0.58 0.6052752 0.55940294 


A >1Q77KOft • l17fi7Q ot 

0.4o/ f WO Uo/O/ ^ai 


ii nMmh r-nnr^ifi^ unf irla rn^i nrotain nnH r^arahollar Honpnpratinn aniinpn fhPta-NAP) mRNA 

ixeurorvspeciTic vesicie copi pioiein aiiu \#eieucijpi ucyo ici ouui p anu^cii / 


classl 


n fi n MAO AAA f\ TZ A 70H /I 4 

0.54 0.5903411 0.5472241 


0.42boUb5 U£oybj_ai 


/^ric-O /PDOl m pMA 

opSi (vjro*:; mKiNA 


classl 


0.53 0.5833272 0.53512by 


A /I i A 7CZA TR ot 


mKiNA sequence (i oqi i- io> 


classl 


A CA A CCD4C70 A CHjIQCC 

0.52 U.5b815/3 U.5id14ob5 


a AARftQOOR 1 HR0.1A at 


pt\/z Ptc t /a riant nana A /F1 A pnhanrpr-hinflino nrotain E1AF) 


classl 


0.51 0.5620877 0.5118141 


A QQRIIOflO KyiO70ft7 at 

u.oyb I lyo/: My/ ^o/^ai 


cnTDi Cnoniai at rir-h connonro hinrtinn nrntpin 1 ihinrte to nuclpar matrixyscaffold-associatino DNA's) 

Dl opeCldl M I -1 lull Sequel IL.e UlllUIIiy piULtilll I iu 1 luucai inaiiwjuonuiu gjjuwauuy ^ ' ^ / 


classl 


0.51 0.5489072 0.5050725 


A ^QD^y^7/l I ^n-inA ot 

0.3881474 U7o1ou_at 


oOdium cnannei ^ (noNauz; mKiNA, auernaiiveiy spitceo 


classl 


0.5 0.5460772 0.49833363 


0.38249b92 b/b4/5_at 


iMlRKo i\eurotropnic tyrosine Kinase, receptor, type j \ iir\^; 


classl 


0.49 0.5402584 0.49208724 


A "37C717CC nOO-1 T/f ot 

0.37573755 Uzo1^4__at 


Unknown product 


class1  0.47  0.5376918  0.49007148  0.37205702  U70867_at            Prostaglandin transporter hPGT mRNA
class1  0.47  0.5376203  0.48863164  0.36825827  M17733_at            Thymosin beta-4 mRNA
class1  0.47  0.5331166  0.48450905  0.36425692  L10333_s_at          Neuroendocrine-specific protein A (NSP) mRNA
class1  0.46  0.5309144  0.47723606  0.35968268  D14686_at            AMT Glycine cleavage system protein T (aminomethyltransferase)
class1  0.46  0.5209687  0.47456706  0.35637328  S66541_s_at          B-50=neural phosphoprotein [human, Genomic, 778 nt. segment 3 of 3]
class1  0.46  0.5196613  0.47301164  0.3522324   AC002045_xpt2_s_at   A-589H1.2 from Homo sapiens Chromosome 16 BAC clone CIT987-SKA-589H1 -complete genomic sequenc
class1  0.46  0.5092424  0.47074348  0.34869587  M96739_at            NSCL-1 mRNA sequence
class1  0.45  0.502768   0.46288192  0.3469092   D86963_at            PTB Ribosomal protein L26
class1  0.44  0.5001023  0.46171203  0.3424973   U40271_s_at          PTK7 Protein-tyrosine kinase 7
class1  0.44  0.4967905  0.4602382   0.34013447  L09229_s_at          FACL1 Long chain fatty acid acyl-coA ligase
class1  0.43  0.4930968  0.457573    0.33526275  D78012_at            CRMP1 Collapsin response mediator protein 1
class1  0.43  0.4905391  0.455364    0.33248144  M74715_s_at          IDUA Iduronidase, alpha-L-
class1  0.43  0.4900592  0.45425656  0.32911295  HG2525-HT2621_at     Helix-Loop-Helix Protein Delta Max, Alt. Splice 1
class1  0.43  0.4890777  0.4522365   0.32650998  L32164_at            Zinc finger protein mRNA, 3' end
class1  0.42  0.4890256  0.44947585  0.32418716  L04731_at            Translocation T(4:11) of ALL-1 gene to chromosome 4
class1  0.42  0.4885243  0.44821537  0.32321975  M22919_rna2_at       MLC gene (non-muscle myosin light chain) extracted from Human non muscle/smooth muscle alkali myosin ligh
class1  0.42  0.4878781  0.44515502  0.3214133   X15882_at            COL6A2 Collagen, type VI, alpha 2
class1  0.42  0.484461   0.4439538   0.3199113   U20657_at            Ubiquitin protease (Unph) proto-oncogene mRNA
class1  0.42  0.4837765  0.442202    0.31707913  L17327_at            Pre-T/NK cell associated protein (3B3) mRNA, 3' end
class1  0.41  0.4793581  0.4388235   0.3152247   J05412_at            REG1A Regenerating islet-derived 1 alpha (pancreatic stone protein, pancreatic thread protein)
class1  0.41  0.4784921  0.43725225  0.3134606   D43682_s_at          Very-long-chain acyl-CoA dehydrogenase (VLCAD)
class1  0.41  0.4770678  0.43665424  0.3111634   X58521_at            NUCLEAR PORE GLYCOPROTEIN P62
class1  0.41  0.4761319  0.43582407  0.3108418   M21142_cds2_s_at     Guanine nucleotide-binding protein G-s-alpha-3 gene extracted from Human guanine nucleotide-binding proteir
class1  0.4   0.4710167  0.43269396  0.30824196  X52896_s_at          RNA for dermal fibroblast elastin
class1  0.4   0.4703343  0.43064603  0.30662587  D50663_at            CW-1 mRNA
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class 1 


A A A A A7ft1 1 1 A 4 90.9^0*56 


A ^A/lA9^1fi I l?R1?Q at 


ClaSS 1 


u.** u.'toooyoo u.*fto^yio 


A ^A9^Q94A MlfififiA at 
U.OUtOy^.'tO U IOQOv_dl 


classl 


A A r\ A P. A A 1 Q n /I 97R71 9 A 

U.4 U.4044iy U.4Z/0/ IZiJ 


a ^aa*;qaar 1 104941 =t 


classl 


nan 4R94^Qfi n 49^7A7n4 


A 9QQQQftRR YD7A47 at 


class 1 


A 4 A 4fiA4**R7 A 4994 99** 


A 9Q779R4 1 I7AR91 at 


classl 




A 9Q79A1 94 YQ^1 1 e 


classl 


A *50 A 4*;fiQA9ft A41799A*? 

u.oy u.^ooyo^o u.*n/z^oo 


A 9Q^ftAR4 H3A71*; vr»tH 


classl 


A QQ n /l CCR771 r\ A oco 

U.oa u.4obo771 U.4171ooo 


U.zy4oro4 Uo1y^U_at 


classl 


0.39 0.4550281 0.41622347 


0.29249102 U02619_at 


classl 


0.39 0.454998 0.41258836 


0.2910374 U14417_at 


classl 


0.39 0.4549076 0.41 155508 


0.28867468 M73547_at 


classl 


0.39 0.4530375 0.41020757 


0.28755918 U09820_s_at 


classl 


0.39 0.4513509 0.40929455 


0.2865325 X13461_s_at 


classl 


0.39 0.4505063 0.40849882 


0.28548497 256281_at 



NECDIN related protein mRNA
Peroxisomal enoyl-CoA hydratase-like protein (HPXEL) mRNA
Homolog of Drosophila enhancer of split m9/m10 mRNA
RRP22 protein
Immunophilin homolog ARA9 mRNA
Telomeric repeat binding factor (TRF1) mRNA
Exon2a from Human PAP (pancreatitis-associated protein) gene, 5'-flanking region /ntype=DNA /annot=exon
SRP54 Signal recognition particle 54 kD protein
TFIIIC Box B-binding subunit mRNA
Ral guanine nucleotide dissociation stimulator mRNA, partial cds
POLYPOSIS LOCUS PROTEIN 1
Helicase II (RAD54L) mRNA
CALMODULIN-RELATED PROTEIN NB-1
Interferon regulatory factor 3

k-nearest neighbors treatment outcome prediction results

This section contains the detailed sample predictions, error rates and survival 
analysis results for the k-nearest neighbor algorithm.

Medulloblastoma Treatment outcome prediction 
k-Nearest Neighbors Algorithm 

Values thresholded to 100 from below and 16000 from above 
Variation filter: max/min > 5 (5-fold), max-min= 500 absolute units

Number of features (genes) = 8. Median based feature selection. K=5, 1/distance weighting 
4459 genes pass the filter. 
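
The settings above (K=5, 1/distance weighting) can be illustrated with a minimal sketch of a distance-weighted k-nearest neighbor vote. The Euclidean distance, the function name knn_predict and the toy data are assumptions made for illustration only; the distance measure actually used is not stated in this section.

```python
import math
from collections import defaultdict


def knn_predict(train, labels, query, k=5):
    """Predict a label by 1/distance-weighted voting among the k nearest samples.

    Assumption: Euclidean distance over the selected features.
    """
    # Pair each training sample's distance to the query with its label.
    neighbors = sorted((math.dist(x, query), y) for x, y in zip(train, labels))
    votes = defaultdict(float)
    for distance, label in neighbors[:k]:
        # Closer neighbors get larger weights; guard against zero distance.
        votes[label] += 1.0 / max(distance, 1e-12)
    return max(votes, key=votes.get)


# Toy one-feature example (hypothetical values): a single very close
# neighbor of class 1 outweighs two distant neighbors of class 0.
toy_train = [(0.0,), (3.0,), (3.1,)]
toy_labels = [1, 0, 0]
toy_call = knn_predict(toy_train, toy_labels, (0.5,), k=3)
```

With unweighted voting the same toy query would be assigned to class 0, which is the difference the 1/distance weighting makes.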

Dataset C 

Confusion Matrix

                Predicted
Actual       Survivors   Failures
Survivors       37           2        39
Failures        11          10        21
                48          12        60

Fisher exact test P-val = 0.0002
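
As a check on the Fisher exact test value, the following sketch evaluates the test directly from the hypergeometric distribution for the 2x2 confusion matrix above. The function name and the two-sided convention (summing all tables no more probable than the observed one) are assumptions; the original computation may have used a different convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Probability of a table with top-left cell x and the same margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables at most as probable as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# k-NN confusion matrix: survivors 37/2, failures 11/10.
p_knn = fisher_exact_two_sided(37, 2, 11, 10)
print(f"{p_knn:.4f}")
```

For this table the sum rounds to 0.0002 at four decimal places.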





Proportional chance criterion (see chapter VII of Huberty's Applied Discriminant Analysis)

Cpro= (39/60)*(39/60) + (21/60)*(21/60) 

Cpro= 0.545 

Pcc= 47/60= 0.783333 

(Pcc - Cpro)/Sqrt(Cpro(1-Cpro)/n) = Z = 3.101741, p-val = 0.000962000
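
The proportional chance criterion can be evaluated with the short sketch below, which applies the formula exactly as written. The function name and the one-sided normal p-value are assumptions. Evaluated with n = 60, the statistic comes out near 3.71; the reported Z = 3.101741 and p-val = 0.000962 are reproduced when n = 42 is used in the denominator, so the value of n behind the reported figure is left as a parameter here.

```python
from math import erfc, sqrt


def proportional_chance_z(class_sizes, n_correct, n=None):
    """Return (Cpro, Pcc, Z, one-sided p-value) for the proportional chance criterion."""
    total = sum(class_sizes)
    n = total if n is None else n  # n in the denominator; defaults to the sample count
    c_pro = sum((size / total) ** 2 for size in class_sizes)
    p_cc = n_correct / total
    z = (p_cc - c_pro) / sqrt(c_pro * (1 - c_pro) / n)
    p_value = 0.5 * erfc(z / sqrt(2))  # upper tail of the standard normal
    return c_pro, p_cc, z, p_value


# 39 survivors, 21 failures, 47 correct predictions.
c_pro, p_cc, z_60, p_60 = proportional_chance_z([39, 21], 47)
_, _, z_42, p_42 = proportional_chance_z([39, 21], 47, n=42)
```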



Num Data   Num Right   Num Wrong   Threshold   Num Abstain   Abs Error   ROC Error
60         47          13          0




60 5 






Data point     Predicted Class   Confidence   True Class   Error?
Brain_MD_1     0                 0            0
Brain_MD_2     1                 0.014714     0            *
Brain_MD_3     1                 0.051905     0            *
Brain_MD_4     0                 0.016268     0
Brain_MD_5     1                 0.244053     0            *
Brain_MD_6     0                 0.00339      0
Brain_MD_7     1                 0.174131     0            *
Brain_MD_8     1                 0.021684     0            *
Brain_MD_9     0                 0            0
Brain_MD_10    0                 0.077769     0
Brain_MD_11    0                 0.238361     0
Brain_MD_12    0                 0.395321     0
Brain_MD_13    1                 0.117485     0            *
Brain_MD_14    1                 0.621401     0            *
Brain_MD_15    0                 0.163458     0
Brain_MD_16    0                 0.001465     0
Brain_MD_17    0                 0.368863     0
Brain_MD_18    1                 0.196423     0            *
Brain_MD_19    1                 0.083515     0            *
Brain_MD_20    1                 0.131556     0            *
Brain_MD_21    1                 0.236        0            *
Brain_MD_22    1                 0.119483
Brain_MD_23    1                 0.449442
Brain_MD_24    1                 0.087128     1
Brain_MD_25    1                 0.002469     1
Brain_MD_26    1                 0.004054     1
Brain_MD_27    1                 0.229156     1
Brain_MD_28    1                 0.214794     1
Brain_MD_29    1                 0.132556     1
Brain_MD_30    1                 0.004142     1
Brain_MD_31    1                 0.071982     1
Brain_MD_32    1                 0.15699      1
Brain_MD_33    1                 0.070619     1
Brain_MD_34    1                 0.086266     1
Brain_MD_35    1                 0.095713     1
Brain_MD_36    0                 0.134619     1            *
Brain_MD_37    1                 0.115611     1
Brain_MD_38    1                 1.27E-04     1
Brain_MD_39    1                 0.085404     1
Brain_MD_40    1                 0.175227     1
Brain_MD_41    0                 0.001709     1            *
Brain_MD_42    1                 0.434137     1
Brain_MD_43    1                 0.042809     1
Brain_MD_44    1                 0.038684     1
Brain_MD_45    1                 0.012557     1
Brain_MD_46    1                 0.190361     1
Brain_MD_47    1                 0.078001     1
Brain_MD_48    1                 0.028872     1
Brain_MD_49    1                 0.209988     1
Brain_MD_50    1                 0.440045     1
Brain_MD_51    1                 0.186536     1
Brain_MD_52    1                 0.32828      1
Brain_MD_53    1                 0.01044      1
Brain_MD_54    1                 0.096563     1
Brain_MD_55    1                 0.00146      1
Brain_MD_56    1                 0.310485
Brain_MD_57    1                 0
Brain_MD_58    1                 0.00709
Brain_MD_59    1                 0.010633
Brain_MD_60    1                 0.059727





Kaplan Meier Plot: Survival versus Time [months], p-val=3.3e-06
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Permutation test for k-nearest neighbor outcome predictor 

The picture below shows the permutation test for the k-nearest neighbor predictor 
using the method described in the Permutation Test for Outcome Predictor 
section. 



Number of genes parameter (ng) values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 25, 50, 100
Number of neighbors (k) parameter values: 3, 5
Number of random permutations was 1000

There are 9 k-NN random models better (lower error rates) than the actual k-NN
model (k=5, ng=8) that achieves 13 errors. The significance is 9/1000 =
0.009. 
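
The significance calculation above can be sketched as follows. The procedure for building each random model is described in another section and is not reproduced here, so it is represented by a hypothetical callable, best_error_for, assumed to return the lowest error count found over the (ng, k) grid for a given set of permuted labels. All names are illustrative.

```python
import random


def permutation_significance(actual_errors, labels, best_error_for,
                             n_perm=1000, seed=0):
    """Fraction of label permutations whose best model beats the actual model."""
    rng = random.Random(seed)
    better = 0
    for _ in range(n_perm):
        shuffled = list(labels)
        rng.shuffle(shuffled)  # break the link between profiles and outcome
        if best_error_for(shuffled) < actual_errors:
            better += 1
    return better / n_perm


# Dataset C has 39 survivors (1) and 21 failures (0); the actual model makes 13 errors.
outcome_labels = [1] * 39 + [0] * 21
# Stand-in callables for illustration only: they ignore the labels entirely.
never_better = permutation_significance(13, outcome_labels, lambda labs: 20)
always_better = permutation_significance(13, outcome_labels, lambda labs: 5)
```

With 9 of 1000 permutations producing a better model, the same ratio gives 9/1000 = 0.009.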




Weighted voting treatment outcome prediction results 

This section contains the detailed sample predictions, error rates and survival 
analysis results for the weighted voting algorithm. 

Medulloblastoma treatment outcome prediction 
Weighted voting algorithm 

Values thresholded to 100 from below and 16000 from above 
Variation filter: max/min > 5 (5-fold), max-min= 500 absolute units 
4459 genes pass the filter. 
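
The thresholding and variation filter can be sketched as below. This is an illustration under stated assumptions: both conditions are taken to be required, both are treated as strict inequalities (the source is ambiguous about the max-min condition), and the function name is hypothetical.

```python
def passes_variation_filter(values, floor=100.0, ceiling=16000.0,
                            min_fold=5.0, min_delta=500.0):
    """Clip expression values to [floor, ceiling], then test fold change and range."""
    clipped = [min(max(v, floor), ceiling) for v in values]
    highest, lowest = max(clipped), min(clipped)
    # Assumption: a gene must satisfy both the fold and the absolute criterion.
    return highest / lowest > min_fold and highest - lowest > min_delta


# Hypothetical expression values for three genes across a few samples.
kept = passes_variation_filter([50, 120, 900])        # clipped to 100..900
low_fold = passes_variation_filter([100, 400, 450])   # max/min = 4.5
low_range = passes_variation_filter([100, 550])       # max-min = 450
```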

Dataset C 

Confusion Matrix

                Predicted
Actual       Survivors   Failures
Survivors       35           4        39
Failures        10          11        21
                45          15        60



Datapoint      Predicted Class   Confidence   True Class   Error?   Final Pred T
Brain_MD_1     0                 0.3952327    0                     0
Brain_MD_2     1                 0.4901857    0            *        1*
Brain_MD_3     0                 0.7813152    0                     0
Brain_MD_4     0                 0.5952852    0                     0
Brain_MD_5     1                 0.3449895    0            *        1*
Brain_MD_6     0                 0.6714032    0                     0
Brain_MD_7     1                 0.0853546    0            *        1*
Brain_MD_8     1                 0.0744516    0            *        1*
Brain_MD_9     1                 0.339672     0            *        1*
Brain_MD_10    0                 0.8249887    0                     0
Brain_MD_11    0                 0.4271654    0                     0
Brain_MD_12    0                 0.1754278    0                     1*
Brain_MD_13    0                 0.3934312    0                     0
Brain_MD_14    1                 0.7060123    0            *        1*
Brain_MD_15    1                 0.2107395    0            *        1*
Brain_MD_16    0                 0.3573091    0                     0
Brain_MD_17    0                 0.3510691    0                     0
Brain_MD_18    0                 0.4114339    0                     0
Brain_MD_19    0                 0.3019681    0                     0
Brain_MD_20    1                 0.5024587    0            *        1*
Brain_MD_21    1                 0.4159824    0            *        1*
Brain_MD_22    1                 0.0565264
Brain_MD_23    1                 0.3096231
Brain_MD_24    1                 0.0712419
Brain_MD_25    1                 0.3827674
Brain_MD_26    0                 0.3012823    1            *        0*
Brain_MD_27    1                 0.3565034    1
Brain_MD_28    1                 0.2837476    1
Brain_MD_29    1                 0.1805679    1
Brain_MD_30    0                 0.2770377    1            *
Brain_MD_31    0                 0.0251648    1            *
Brain_MD_32    1                 0.2175978    1
Brain_MD_33    0                 0.4701843    1            *
Brain_MD_34    1                 0.6038642    1
Brain_MD_35    0                 0.2912066    1            *
Brain_MD_36    0                 0.6073407    1            *
Brain_MD_37    1                 0.0385693    1
Brain_MD_38    1                 0.1748749    1
Brain_MD_39    1                 0.1087455    1
Brain_MD_40    1                 0.0164151    1
Brain_MD_41    1                 0.0581653    1
Brain_MD_42    1                 0.0824984    1
Brain_MD_43    1                 0.0644064    1
Brain_MD_44    1                 0.0228837    1
Brain_MD_45    1                 0.4116783
Brain_MD_46    1                 0.3631051
Brain_MD_47    1                 0.4536451
Brain_MD_48    1                 0.3325118
Brain_MD_49    0                 0.3832514    1            *
Brain_MD_50    1
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SVM treatment outcome prediction results 

This section contains the detailed sample predictions, error rates and survival 
analysis results for the SVM algorithm. 

Medulloblastoma treatment outcome prediction 
SVM algorithm 

150 genes 

Dataset C 



Confusion Matrix

                Predicted
Actual       Survivors   Failures
Survivors       33           6        39
Failures         9          12        21
                42          18        60


Datapoint      Predicted Class   Confidence    True Class   Error?
Brain_MD_1     0                 -0.793188     0
Brain_MD_2     1                 1.35161       0            *
Brain_MD_3     0                 -0.0212459    0
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0 
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SPLASH treatment outcome prediction results 

This section contains the detailed sample predictions, error rates and survival 
analysis results for the SPLASH algorithm. 

Medulloblastoma treatment outcome prediction 
SPLASH algorithm 

Dataset C 



Confusion Matrix

                Predicted
Actual       Survivors   Failures
Survivors       32           7        39
Failures         8          13        21
                40          20        60


Datapoint      Predicted Class   Confidence   True Class   Error?
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-0.731 


0* 


Brain_MD_9     0                 1.41         0
Brain_MD_10    0                 1.582        0
Brain_MD_11    0
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TrkC treatment outcome prediction results 

This section contains the detailed sample predictions, error rates and survival 
analysis results for the single-gene TrkC predictor. 

TrkC single-gene predictor 

Values thresholded to 100 from below and 16000 from above 
Variation filter: max/min > 5 (5-fold), max-min= 500 absolute units 
Number of features (genes) = 1 = TrkC 



Dataset C 

Confusion Matrix 



                Predicted
Actual       Survivors   Failures
Survivors       23          16        39
Failures         4          17        21
                27          33        60



Num Data Num Right Num Wrong Threshold Num Abstain Abs Error ROC Error 
60 40 20 0 0 0.333333 0.300366 



60 5 
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Staging treatment outcome prediction results 

This section contains the detailed sample predictions, error rates and survival 
analysis results for staging as a predictor. 

Staging predictor 

M0 = no metastases; M1 = positive CSF cytology; M2 = local metastases;
M3 = metastases throughout the central nervous system;
M4 = metastases beyond the central nervous system



Dataset C 

Confusion Matrix 



Predicted 
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Combined treatment outcome predictors 

This section describes the results for two combinations of models using a simple 
majority voting rule. The two combinations are Staging + k-NN + TrkC and SVM +
k-NN + TrkC. These combinations achieve better performance than any single 
method alone. 
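
The simple majority voting rule can be sketched as follows. The function name and the example calls are hypothetical; class 1 denotes survivors and class 0 treatment failures, as in the per-sample tables.

```python
def majority_vote(*predictions):
    """Combine per-sample 0/1 calls from an odd number of predictors by majority."""
    if len(predictions) % 2 == 0:
        raise ValueError("an odd number of predictors avoids ties")
    return [
        1 if 2 * sum(calls) > len(calls) else 0
        for calls in zip(*predictions)
    ]


# Hypothetical calls from three predictors on four samples.
staging_calls = [0, 1, 1, 0]
knn_calls = [0, 0, 1, 1]
trkc_calls = [1, 0, 1, 1]
combined_calls = majority_vote(staging_calls, knn_calls, trkc_calls)
```

Each sample receives the class chosen by at least two of the three predictors.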

Combined model I: staging, k-NN and TrkC 

Medulloblastoma treatment outcome prediction 
Combined predictor: Staging, k-NN and TrkC 

Dataset C 
Confusion Matrix

                Predicted
Actual       Survivors   Failures
Survivors       35           4        39
Failures         8          13        21
                43          17        60






Datapoint 


Staging k-NN pred. 
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Combined majority predictor 


error? 
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Combined model II: SVM, k-NN and TrkC 

Medulloblastoma treatment outcome prediction 
Combined predictor: SVM, k-NN and TrkC 



Dataset C 

Confusion Matrix 



Predicted 
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Summary of medulloblastoma treatment outcome predictions 

The following table summarizes the results for the different prediction algorithms
on Dataset C.

All of the multi-gene algorithms achieve similar performance and are better
classifiers of treatment outcome than staging or TrkC alone.

Notice the asymmetry in the number of false negatives and false positives between
TrkC and the other algorithms.

Summary of treatment outcome prediction performance

Dataset C

Algorithm                    Total     Total    Errors in       Errors in        KM Rank Test
                             Correct   Errors   Failure Class   Survival Class   P-value

Staging                      41        19       11              8                0.03
TrkC                         40        20       4               16               0.0024
Weighted Voting              46        14       10              4                0.00005
SVM                          45        15       9               6                0.000027
k-nearest neighbors          47        13       11              2                3.30E-06
SPLASH                       45        15       8               7                2.89E-06
Combined model I             48        12       8               4                1.10E-06
  Staging, k-NN and TrkC
Combined model II            48        12       8               4                1.12E-08
  SVM, k-NN and TrkC
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Improvements of multi-gene prediction algorithm (k-NN) over 
staging and TrkC. 

The following graph shows the result of taking the low and high risk groups as
defined by staging and by TrkC expression and classifying them further using the
k-NN algorithm results. As can be seen in the survival plots on the right side,
the multi-gene algorithm (k-NN) resolves "survival" with additional resolution,
which translates into significant p-values for some of the k-NN subgroups. Notice
in particular how the k-NN algorithm appears to correct the mistakes made by the
staging classifier in low staging risk patients (e.g. those with small tumors but
bad outcomes) and in low TrkC patients (e.g. those with low TrkC expression who
respond to treatment). This confirms that the multi-gene expression profiles
contain additional information not necessarily contained in staging and TrkC
expression alone (the k-NN model did not use TrkC as one of its marker genes).

Kaplan Meier Plots (panel labels and p-values in order of appearance)

TrkC: Low, n=33        k-NN Model   P-val= 0.00244
TrkC: High, n=27       k-NN Model   P-val= 0.0231
Staging: M0, n=42      k-NN Model   P-val= 0.0111
Staging: M1-4, n=18    k-NN Model   P-val= 0.00163
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k-NN predictions in subgroup treated with vincristine, cisplatin 
and Cytoxan. 

All patients received a chemotherapy regime of vincristine and cisplatin. Some
patients also received Cytoxan or a combination of some of the following: Cytoxan,
etoposide, CCNU, carboplatin, procarbazine, methotrexate and thiotepa. To test
whether our multi-gene prediction algorithm was somehow predicting outcome based
on differences in these additional details of the chemotherapy regime, we analyzed
the predictions in the subset of patients who received an identical regime of
vincristine, cisplatin and Cytoxan. The Kaplan Meier survival plot below shows that
the model clearly resolves the failure and survival classes inside this group and
is therefore not a proxy for the type of chemotherapy.



Kaplan Meier Plot: Survival, p-val=0.0012
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Comparison between signal-to-noise and t-test statistic metrics 



In order to better understand the effect of using a more standard metric for gene 
selection we repeated some of the analysis of the Medulloblastoma treatment outcome 
dataset using a t-test statistic metric: 



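$$ t = \frac{\mu_0 - \mu_1}{\sqrt{\sigma_0^2/N_0 + \sigma_1^2/N_1}} $$

where $\mu_0, \mu_1$ and $\sigma_0, \sigma_1$ are the means and standard deviations of
the gene's expression in each class, and $N_0$ and $N_1$ represent the number of
samples in each class.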

The results obtained using this metric are very similar to the ones obtained by the 
signal-to-noise metric used in the paper. This is something we had observed in other 
cases (datasets) as well. The following plot shows the total error rate in cross- 
validation as a function of the number of genes for k-NN models using both metrics. 
These models used k=5 and the same filtering parameters as those used in the "k- 
nearest neighbors treatment outcome prediction results" section earlier in this 
document. 
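
For reference, the two gene-selection metrics compared here can be computed as in the sketch below. The use of the sample standard deviation and variance (statistics.stdev, statistics.variance) is an assumption, as are the function names and the toy values.

```python
from math import sqrt
from statistics import mean, stdev, variance


def signal_to_noise(class0, class1):
    """Signal-to-noise metric: (mu0 - mu1) / (sigma0 + sigma1)."""
    return (mean(class0) - mean(class1)) / (stdev(class0) + stdev(class1))


def t_statistic(class0, class1):
    """t-test statistic: (mu0 - mu1) / sqrt(sigma0^2/N0 + sigma1^2/N1)."""
    pooled = variance(class0) / len(class0) + variance(class1) / len(class1)
    return (mean(class0) - mean(class1)) / sqrt(pooled)


# Toy expression values for one gene in the two outcome classes.
expr_class0 = [1.0, 2.0, 3.0]
expr_class1 = [4.0, 5.0, 6.0]
snr_value = signal_to_noise(expr_class0, expr_class1)
t_value = t_statistic(expr_class0, expr_class1)
```

Both metrics share the same numerator and differ only in how the within-class spread enters the denominator, which is one reason they tend to rank genes similarly.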



[Plot: total cross-validation error rate as a function of the number of genes for
k-NN models, one curve per metric: signal-to-noise and t-test statistic]

Signal-to-noise: $(\mu_0 - \mu_1)/(\sigma_0 + \sigma_1)$
T-test statistic: $(\mu_0 - \mu_1)/\sqrt{\sigma_0^2/N_0 + \sigma_1^2/N_1}$



The similarity of the k-NN model results is a consequence of the fact that the two 
metrics produce very similar ranking of features. The next plot compares the rankings 
produced by both metrics using a color scheme. The parameter f determines how large a
difference in rank is allowed for a gene to be considered in the "same" rank (green).
We show results for f = 5 and 0 (exact matching). This comparison shows the close 
similarity in the rankings obtained using the two metrics. 



Ranking comparison for signal to noise and t-test metrics

green -> abs(difference in rank) <= f
red   -> rank is higher in signal to noise
blue  -> rank is lower in signal to noise
white -> missing in t-test (found in signal to noise)
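
The color scheme of the ranking comparison can be expressed as a small sketch. It assumes that "higher" means nearer the top of the list (a smaller rank index), which the legend does not state explicitly; the function name and gene identifiers are hypothetical.

```python
def compare_rankings(snr_ranking, ttest_ranking, f=0):
    """Assign a color to each gene in the signal-to-noise ranking."""
    ttest_position = {gene: i for i, gene in enumerate(ttest_ranking)}
    colors = {}
    for i, gene in enumerate(snr_ranking):
        if gene not in ttest_position:
            colors[gene] = "white"   # missing in the t-test ranking
        elif abs(i - ttest_position[gene]) <= f:
            colors[gene] = "green"   # same rank within tolerance f
        elif i < ttest_position[gene]:
            colors[gene] = "red"     # assumed: nearer the top in signal to noise
        else:
            colors[gene] = "blue"    # assumed: further down in signal to noise
    return colors


snr_order = ["g1", "g2", "g3", "g4"]
ttest_order = ["g2", "g1", "g3"]
exact = compare_rankings(snr_order, ttest_order, f=0)
tolerant = compare_rankings(snr_order, ttest_order, f=1)
```

With f = 0 only exact matches are green; raising f to 1 turns the swapped pair g1 and g2 green as well.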
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